Date: Fri, 3 Jul 2026 14:33:49 +0200
From: Andrea Righi
To: K Prateek Nayak
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Ricardo Neri, Christian Loehle, Shrikanth Hegde,
 Felix Abecassis, Joel Fernandes, Phil Auld,
 linux-kernel@vger.kernel.org, Julia Lawall
Subject: Re: [PATCH] sched/fair: Stabilize idle SMT core selection with asym-capacity
References: <20260630152747.128746-1-arighi@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 03, 2026 at 11:40:28AM +0200, Andrea Righi wrote:
...
> > > On NVIDIA Vera Rubin (arm64, 176 CPUs/88 cores per NUMA node), a
> > > CPU-intensive NVPL SGEMM workload restricted to 88 threads (one per
> > > core) showed a consistent 23% increase in mean throughput across
> > > multiple runs.
> >
> > Interesting! This reads like active balance across cores is not aggressive
> > enough for this workload and, as a result, stacking somehow helps.
> >
> > I would have expected balance within the core would trigger first and that
> > would just lead to the same scenario as both siblings busy but I
> > guess there is a higher order effect of stacking.
>
> I think the key here is that temporary runqueue stacking is preferable to
> consuming both SMT siblings when fully-idle SMT cores are available, more than
> having benefits from the stacking itself.
>
> >
> > perf sched stats reports for this workload before and after
> > applying your patch may help to see what changes for the load
> > balancer to start doing better.
>
> Ack, I'll collect some perf stats and share.
>

I collected perf sched stats on mainline and on the patched kernel and diffed
them. Here's a quick recap of the benchmark results and stats (I can also
share all the detailed stats if you prefer):

                            mainline     patched
  elapsed jiffies              17472       13808
  average GFLOP/s            6297.62     8423.60
  sched_yield calls           11.47M       4.47M
  run delay / runtime          0.20%       0.31%
  timeslices                     168         562

  Across SMT, MC and NUMA domains:
  *_lb_gained                      0           0
  alb_pushed                       0           0
  ttwu_move_balance                0           0

The schedstat comparison doesn't show the load balancer moving any
potentially stacked tasks: *_lb_gained, alb_pushed and ttwu_move_balance
remain 0 across the domains. So the gain doesn't come from post-wakeup
balancing.

The only clear difference is that the sched_yield() rate (calls per elapsed
jiffy) drops by approximately 51%. This might explain the speedup, but the
stats don't expose the CPU selected by select_idle_capacity(), so they can't
directly prove whether the placement was beneficial.

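For anyone who wants to re-derive the percentages from the recap above, here
is a minimal sketch. It assumes the "rate" is sched_yield() calls per elapsed
jiffy, which is my reading of the numbers and not something perf reports
directly. The dictionary keys and helper names are made up for illustration.

```python
# Figures copied from the recap table (mainline vs. patched kernel).
# Key names are illustrative, not perf sched stats field names.
mainline = {"jiffies": 17472, "gflops": 6297.62, "yields": 11.47e6}
patched = {"jiffies": 13808, "gflops": 8423.60, "yields": 4.47e6}


def pct_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


def yield_rate(run):
    """Assumed definition of the rate: sched_yield() calls per elapsed jiffy."""
    return run["yields"] / run["jiffies"]


elapsed_change = pct_change(mainline["jiffies"], patched["jiffies"])
throughput_change = pct_change(mainline["gflops"], patched["gflops"])
yield_count_change = pct_change(mainline["yields"], patched["yields"])
yield_rate_change = pct_change(yield_rate(mainline), yield_rate(patched))

print(f"elapsed jiffies:         {elapsed_change:+.1f}%")
print(f"average GFLOP/s:         {throughput_change:+.1f}%")
print(f"sched_yield calls:       {yield_count_change:+.1f}%")
print(f"sched_yield calls/jiffy: {yield_rate_change:+.1f}%")
```

Under that assumption the per-jiffy rate falls by roughly 51%, while the raw
call count falls by roughly 61%, since the patched run is also shorter.
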
I'll collect more stats.

Thanks,
-Andrea