Date: Wed, 6 May 2026 20:31:09 +0200
From: Andrea Righi
To: Dietmar Eggemann
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Christian Loehle, Koba Ko, Felix Abecassis, Balbir Singh, Joel Fernandes, Shrikanth Hegde, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/5] sched/fair: Prefer fully-idle SMT cores in asym-capacity idle selection
References: <20260428144352.3575863-1-arighi@nvidia.com> <20260428144352.3575863-4-arighi@nvidia.com> <803d8684-585e-4f41-8d9a-d9984923c3f2@arm.com>
In-Reply-To: <803d8684-585e-4f41-8d9a-d9984923c3f2@arm.com>

Hi Dietmar,

On Tue, May 05, 2026 at 07:20:35PM +0200, Dietmar Eggemann wrote:
> On 28.04.26 16:41, Andrea Righi wrote:
> > On systems with asymmetric CPU capacity (e.g., ACPI/CPPC reporting
> > different per-core frequencies), the wakeup path uses
>
> I assume those CPPC systems w/ different per-core frequencies (like your
> Vera) are the only real one which would make use of this. Mobile
> big.LITTLE/DynamIQ don't have SMT.
>
> Phil mentioned other machines (PowerPC ?) which had issues with using
> select_idle_capacity():
>
> https://lore.kernel.org/r/20260325124840.GA98184@pauld.westford.csb
>
> [...]
>
> > On an SMT system with asymmetric CPU capacities, SMT-aware idle
> > selection has been shown to improve throughput by around 15-18% for
> > CPU-bound workloads, running an amount of tasks equal to the amount of
> > SMT cores.
>
> Just to make sure, this should be your internal NVBLAS benchmark. Is
> this 'ASYM (mainline) vs. ASYM + SMT' or 'NO_ASYM vs. ASYM + SMT' ? I
> try to match the cover letter's table numbers.

Yes, the 15-18% is with NVBLAS and it's NO_ASYM (mainline) vs ASYM + SMT. The
speedup of ASYM (mainline) vs ASYM+SMT is like +60% (keep in mind that with this
workload the SMT part plays a big role, because it's creating exactly nr_cpus/2
tasks => 1 task per SMT core, hence the big speedup number).

>
> [...]
>
> > @@ -7997,8 +8013,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> >  static int
> >  select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> >  {
> > +        bool prefers_idle_core = sched_smt_active() && test_idle_cores(target);
>
> nit: why prefers_idle_core and not has_idle_core like in sis()?

Yeah, sounds good, I'll change to has_idle_core.

>
> [...]
>
> > @@ -8047,12 +8102,17 @@ static inline bool asym_fits_cpu(unsigned long util,
> >                                   unsigned long util_max,
> >                                   int cpu)
> >  {
> > -        if (sched_asym_cpucap_active())
> > +        if (sched_asym_cpucap_active()) {
> >                  /*
> >                   * Return true only if the cpu fully fits the task requirements
> >                   * which include the utilization and the performance hints.
> > +                 *
> > +                 * When SMT is active, also require that the core has no busy
> > +                 * siblings.
> >                   */
> > -                return (util_fits_cpu(util, util_min, util_max, cpu) > 0);
> > +                return (!sched_smt_active() || is_core_idle(cpu)) &&
> > +                       (util_fits_cpu(util, util_min, util_max, cpu) > 0);
> > +        }
>
> Not sure whether this has been discussed already. This makes all early
> bailout conditions in sis() idle core aware for 'ASYM + SMT' but it's
> not for 'NO_ASYM'?

Yeah, that's another difference from NO_ASYM and I think it's worth a comment.
Maybe in the future it'd be interesting to see how NO_ASYM behaves with the same
idle core aware early bailout conditions (not for this series I'd say).

Thanks,
-Andrea
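
For readers following the thread, the combined condition in the quoted
asym_fits_cpu() hunk can be modeled in user space roughly as below. This is
only a sketch, not kernel code: struct cpu_model, siblings_idle(), util_fits()
and model_asym_fits_cpu() are hypothetical stand-ins for illustration. It
assumes two SMT threads per core laid out as (2k, 2k+1), and it replaces
util_fits_cpu() with a plain ~80% margin check that ignores the util_min and
util_max hints.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for per-CPU scheduler state. */
struct cpu_model {
    bool idle;               /* true if nothing runs on this CPU */
    unsigned long capacity;  /* capacity on a 0..1024 scale */
};

/*
 * Assumption: two SMT threads per core, numbered (2k, 2k+1), so the
 * sibling of a CPU is cpu ^ 1. The kernel walks cpu_smt_mask() instead.
 * Only the sibling is inspected, mirroring "no busy siblings".
 */
bool siblings_idle(const struct cpu_model *cpus, int nr_cpus, int cpu)
{
    int sibling = cpu ^ 1;

    /* A CPU whose sibling is out of range has no busy sibling. */
    return sibling >= nr_cpus || cpus[sibling].idle;
}

/*
 * Simplified fit check (assumption): utilization must stay below
 * roughly 80% of the CPU capacity. Performance hints are not modeled.
 */
bool util_fits(unsigned long util, unsigned long capacity)
{
    return util * 1280 < capacity * 1024;
}

/*
 * Model of the quoted condition: when SMT is active the CPU only
 * "fits" if its sibling is idle AND the utilization fits; without
 * SMT only the utilization check applies.
 */
bool model_asym_fits_cpu(const struct cpu_model *cpus, int nr_cpus,
                         bool smt_active, unsigned long util, int cpu)
{
    assert(cpu >= 0 && cpu < nr_cpus);

    return (!smt_active || siblings_idle(cpus, nr_cpus, cpu)) &&
           util_fits(util, cpus[cpu].capacity);
}
```

With CPU 1 busy and CPUs 0, 2 and 3 idle, a task with util 400 is rejected on
CPU 0 when smt_active is true (its sibling is busy) but accepted on CPU 2. With
smt_active false, CPU 0 is accepted again, which is the behavioral difference
the hunk introduces for 'ASYM + SMT'.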