Date: Tue, 21 Apr 2026 11:35:24 +0200
From: Andrea Righi
To: K Prateek Nayak
Cc: Dietmar Eggemann, Ingo Molnar, Peter Zijlstra, Juri Lelli,
 Vincent Guittot, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Christian Loehle, Koba Ko, Felix Abecassis,
 Balbir Singh, Shrikanth Hegde, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] sched/fair: Prefer fully-idle SMT cores in
 asym-capacity idle selection
References: <20260403053654.1559142-1-arighi@nvidia.com>
 <20260403053654.1559142-2-arighi@nvidia.com>
 <64fe32e0-d428-42bb-beb4-2656d8781b0f@arm.com>
 <7313ba07-7b87-447c-9c48-2f6b2b53ac94@amd.com>
 <1230f5df-470a-4e59-8c8e-fa159a6fc093@amd.com>

On Tue, Apr 21, 2026 at 11:01:41AM +0200, Andrea Righi wrote:
> Hi Prateek,
>
> On Mon, Apr 20, 2026 at 11:42:23PM +0200, Andrea Righi wrote:
> ...
> > > >> I still have one question: Can first SD_ASYM_CPUCAPACITY_FULL be set at
> > > >> a SD_NUMA?
> > > >>
> > > >> We'll need to deal with overlapping domains then but seems like it could
> > > >> be possible with weird cpusets :-(
> > > >>
> > > >> But in that case, do we even want to search CPUs outside the NUMA in
> > > >> select_idle_capacity()? I don't think anything stops this currently but
> > > >> I might be wrong.
> > > >
> > > > My $0.02 on this.
> > > >
> > > > In theory it could happen with unusual topologies or constrained cpusets,
> > > > although it should be quite rare. That said, select_idle_capacity() already
> > > > operates on the span of sd_asym_cpucapacity, so if that domain crosses NUMA
> > > > boundaries, we're already scanning across NUMA today. This patch doesn't
> > > > fundamentally alter this behavior.
> > >
> > > Ack! I was just thinking loud from the topology standpoint since
> > > sd->shared is not designed to handle the overlapping domains like
> > > sg->sgc does but we can probably figure some way to make it work.
> > >
> > > Using the ring topology example from topology.c:
> > >
> > >   0 ----- 1
> > >   |       |
> > >   |       |
> > >   |       |
> > >   3 ----- 2
> > >
> > > Consider NUMA-1 below gets the SD_ASYM_CPUCAPACITY_FULL flag:
> > >
> > >   NUMA-2    0-3             0-3             0-3             0-3
> > >   groups:   {0-1,3},{1-3}   {0-2},{0,2-3}   {1-3},{0-1,3}   {0,2-3},{0-2}
> > >
> > >   NUMA-1    0-1,3           0-2             1-3             0,2-3
> > >   groups:   {0},{1},{3}     {0},{1},{2}     {1},{2},{3}     {0},{2},{3}
> > >
> > >   NUMA-0    0               1               2               3
> > >
> > >
> > > The "sd->shared" assignments at NUMA-1 will put first, second, and the
> > > last domain in the same "shared" range by today's logic since the first
> > > CPU in their span is the same although their spans are slightly
> > > different.
> > >
> > > The third will be standalone since the first CPU of the domain span
> > > will be different.
> >
> > Yeah, makes sense. I'm wondering if we should attach the shared blob to
> > sd_asym_cpucapacity only when asym is a non-overlapping domain, otherwise
> > fallback to sd_llc and, in this case, ignore has_idle_cores in
> > select_idle_capacity(). This might be not the best in terms of efficiency on
> > those exotic topologies, but it'd eliminate the overlap/aliasing risk, while
> > still being correct. What do you think?
>
> I slightly changed your patch adding this logic on top, I'll send an updated
> patch series, so it's easier to review/comment.

Actually... while preparing the series I realized that in
select_idle_capacity() we may end up clearing the has_idle_cores hint even
when the failure is due to affinity constraints (no fitting CPU in the
allowed cpumask), not only when no fully idle core is found in the system.
This can make the has_idle_cores hint inaccurate: it gets cleared even
though fully idle cores may still exist outside the task's affinity mask.

At this point I'm wondering if it's better to just ignore the
has_idle_cores hint completely in the SMT + asym-CPU-capacity scenario
(which would also simplify the exotic topology cases).

I did some quick tests with this on Vera and I'm getting pretty much the
same performance results.

Opinions? Am I missing something?

Thanks,
-Andrea
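
For the sd->shared aliasing in the quoted ring-topology example, a small
sketch follows. It assumes, as the quoted text describes, that the shared
data is picked by the first CPU in the domain span. The toy_* names and the
bitmask encoding are hypothetical and stand in for the real cpumask
handling; this is not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit, standing in for "first CPU in the span". */
int toy_first_cpu(uint32_t span)
{
	int cpu = 0;

	assert(span != 0);
	while (!(span & 1u)) {
		span >>= 1;
		cpu++;
	}
	return cpu;
}

/* NUMA-1 spans from the ring example, indexed by CPU 0..3 (bit i = CPU i). */
const uint32_t toy_numa1_span[4] = {
	0xBu,	/* CPU0: 0-1,3 */
	0x7u,	/* CPU1: 0-2   */
	0xEu,	/* CPU2: 1-3   */
	0xDu,	/* CPU3: 0,2-3 */
};

/*
 * Two domains alias when they would pick the same shared slot (same first
 * CPU) even though their spans are not identical.
 */
int toy_shared_aliases(uint32_t a, uint32_t b)
{
	return toy_first_cpu(a) == toy_first_cpu(b) && a != b;
}
```

Under that assumption the spans of CPU0, CPU1 and CPU3 all start at CPU 0
and would alias, while the span of CPU2 starts at CPU 1 and stands alone,
which matches the grouping described in the quoted text.
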