Date: Sat, 27 Jun 2026 21:07:19 +0200
From: Andrea Righi
To: Ricardo Neri
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Tim C Chen, Chen Yu, Christian Loehle,
 K Prateek Nayak, Barry Song, "Rafael J. Wysocki", Len Brown,
 ricardo.neri@intel.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 5/6] sched/fair: Allow load balancing between CPUs of identical capacity
References: <20260622-rneri-fix-cas-clusters-v5-0-19968f2d1497@linux.intel.com>
 <20260622-rneri-fix-cas-clusters-v5-5-19968f2d1497@linux.intel.com>
In-Reply-To: <20260622-rneri-fix-cas-clusters-v5-5-19968f2d1497@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Ricardo,

On Mon, Jun 22, 2026 at 05:05:55PM -0700, Ricardo Neri wrote:
> sched_balance_find_src_rq() avoids selecting a runqueue with a single
> running task as busiest if doing so results in migrating the task to a
> CPU with less than ~5% of extra capacity. It also unintentionally
> prevents migrations between CPUs of identical capacity.
>
> When CONFIG_SCHED_CLUSTER is enabled, load should be balanced across
> clusters of CPUs with the same capacity. Allowing migration between CPUs
> of identical capacity is necessary to meet this goal.
>
> Use arch_scale_cpu_capacity() to reflect architectural capacity, excluding
> runtime reductions due to side activity or thermal pressure. Guard this
> check with the sched_cluster_active static key so that systems without
> cluster topology are unaffected.
>
> Tested-by: Christian Loehle
> Signed-off-by: Ricardo Neri
> ---
> Changes in v5:
>  * Optimized logic to identify same-arch clusters only when needed.
>  * Added Tested-by tag from Christian. Thanks!
>
> Changes in v4:
>  * Implemented the check for cluster with a local variable for improved
>    readability.
>
> Changes in v3:
>  * Reverted the inverted capacity check; the inverted form incorrectly
>    allows migrations to CPUs of slightly less capacity.
>  * Guarded the check for architectural capacity with the
>    sched_cluster_active static key.
>
> Changes in v2:
>  * Used arch_scale_cpu_capacity() instead of capacity_of() to ignore
>    runtime variability.
>  * Inverted the check for runtime capacity. (Christian)
>  * Reworded patch description for clarity.
> ---
>  kernel/sched/fair.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e55eb019d2c9..f4eb55cad54d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -12992,13 +12992,20 @@ static struct rq *sched_balance_find_src_rq(struct lb_env *env,
>  		 */
>  		if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
>  		    nr_running == 1) {
> +			bool same_arch_cluster = static_branch_unlikely(&sched_cluster_active) &&
> +						 (arch_scale_cpu_capacity(env->dst_cpu) ==
> +						  arch_scale_cpu_capacity(i));

I find same_arch_cluster a bit misleading. It sounds like "these two CPUs
belong to the same cluster", while what it actually checks is whether a
cluster topology exists somewhere in the root domain and the two CPUs have
exactly the same architectural capacity.

Am I understanding it correctly? If so, would something like
same_arch_capacity or cluster_equal_capacity be a better name? I think
either would make the intent of the code a bit clearer.

Thanks,
-Andrea

>  			bool smt_degraded_cap = sched_smt_active() && !is_core_idle(i);
>
>  			/*
>  			 * Busy SMT siblings reduce the capacity of CPU @i. Do
>  			 * not skip it in this case.
> +			 *
> +			 * CONFIG_SCHED_CLUSTER requires balancing load across clusters
> +			 * of identical capacity. Use architectural capacity to ignore
> +			 * runtime variability.
>  			 */
> -			if (!smt_degraded_cap &&
> +			if (!smt_degraded_cap && !same_arch_cluster &&
>  			    !capacity_greater(capacity_of(env->dst_cpu), capacity))
>  				continue;
>  		}
>
> --
> 2.43.0
>
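
For reference, the skip condition in the quoted hunk can be modelled in plain
userspace C. This is only a rough sketch, not kernel code: struct lb_model and
skip_single_task_rq() are hypothetical names made up for illustration, the
capacity_greater() macro is reproduced from memory of kernel/sched/fair.c (the
~5% margin mentioned in the patch description), and the local variable uses
the suggested name same_arch_capacity rather than the one in the patch.

```c
#include <stdbool.h>

/* Assumed to match the kernel macro: cap1 must exceed cap2 by roughly 5%. */
#define capacity_greater(cap1, cap2) ((cap1) * 1024 > (cap2) * 1078)

/* Hypothetical stand-in for the values the real code reads from env and rq. */
struct lb_model {
	unsigned long dst_cap;      /* runtime capacity of the destination CPU */
	unsigned long src_cap;      /* runtime capacity of candidate CPU i */
	unsigned long dst_arch_cap; /* architectural capacity of the destination */
	unsigned long src_arch_cap; /* architectural capacity of CPU i */
	bool cluster_active;        /* models the sched_cluster_active static key */
	bool smt_degraded_cap;      /* CPU i has a busy SMT sibling */
};

/* Returns true when the single-task runqueue would be skipped ("continue"). */
static bool skip_single_task_rq(const struct lb_model *m)
{
	/* True only if clusters exist AND both architectural capacities match. */
	bool same_arch_capacity = m->cluster_active &&
				  m->dst_arch_cap == m->src_arch_cap;

	return !m->smt_degraded_cap && !same_arch_capacity &&
	       !capacity_greater(m->dst_cap, m->src_cap);
}
```

With all four capacities at 1024, the model skips the runqueue when
cluster_active is false and keeps it as a candidate when cluster_active is
true. Nothing in the condition compares cluster membership itself, which is
why a name that mentions capacity seems to fit better.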