Date: Fri, 8 May 2026 05:52:32 -0700
From: Ricardo Neri
To: Christian Loehle
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Valentin Schneider, Tim C Chen, Chen Yu, Barry Song,
    "Rafael J. Wysocki", Len Brown, ricardo.neri@intel.com,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 3/4] sched/fair: Allow load balancing between CPUs of identical capacity
Message-ID: <20260508125232.GA6459@ranerica-svr.sc.intel.com>
References: <20260429-rneri-fix-cas-clusters-v2-0-cd787de35cc6@linux.intel.com>
 <20260429-rneri-fix-cas-clusters-v2-3-cd787de35cc6@linux.intel.com>
 <0171d56e-c8b3-4a17-85a5-93ac407aae5f@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <0171d56e-c8b3-4a17-85a5-93ac407aae5f@arm.com>
User-Agent: Mutt/1.9.4 (2018-02-28)

On Wed, May 06, 2026 at 02:10:22PM +0100, Christian Loehle wrote:
> On 4/29/26 22:19, Ricardo Neri wrote:
> > sched_balance_find_src_rq() avoids selecting a runqueue with a single
> > running task as busiest if doing so results in migrating the task to a
> > CPU with less than ~5% of extra capacity. It also unintentionally
> > prevents migrations between CPUs of identical capacity.
> >
> > When CONFIG_SCHED_CLUSTER is enabled, load should be balanced across
> > clusters of CPUs with the same capacity. Allowing migration between CPUs
> > of identical capacity is necessary to meet this goal.
> >
> > We are interested in the architectural capacity of the involved CPUs,
> > excluding any reductions due to side activity or thermal pressure. Use
> > arch_scale_cpu_capacity().
> >
> > While here, invert the check for runtime capacity for clarity.
> >
> > Signed-off-by: Ricardo Neri
> > ---
> > Changes since v1:
> >  * Used arch_scale_cpu_capacity() instead of capacity_of() to ignore
> >    runtime variability.
> >  * Inverted the check for runtime capacity. (Christian)
> >  * Reworded patch description for clarity.
> > ---
> >  kernel/sched/fair.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 166a5b109e0e..4105717e64fe 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -11816,9 +11816,14 @@ static struct rq *sched_balance_find_src_rq(struct lb_env *env,
> >  		 * eventually lead to active_balancing high->low capacity.
> >  		 * Higher per-CPU capacity is considered better than balancing
> >  		 * average load.
> > +		 *
> > +		 * Cluster scheduling requires balancing load across clusters
> > +		 * of identical capacity. Use architectural capacity to ignore
> > +		 * runtime variability.
> >  		 */
> >  		if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> > -		    !capacity_greater(capacity_of(env->dst_cpu), capacity) &&
> > +		    arch_scale_cpu_capacity(env->dst_cpu) != arch_scale_cpu_capacity(i) &&
> > +		    capacity_greater(capacity, capacity_of(env->dst_cpu)) &&
> >  		    nr_running == 1)
> >  			continue;
> >
>
> I wonder if we shouldn't use the capacity_greater() margin for both, i.e.
> capacity_greater(arch_scale_cpu_capacity(i), arch_scale_cpu_capacity(env->dst_cpu)) &&
>
> For example, the Orion O6 has a cluster with capacity 1024 and one with 984.
> If we allow balancing 984->984, I think it's only consistent to also allow
> 984->1024.

But that would be a change in the current policy, no? Today we allow a
984->1024 balance based on runtime capacity.

The scope of this patchset is to make SCHED_CLUSTER work as expected for
clusters of the same capacity. Perhaps your proposal of using architectural
capacity can be evaluated in a separate patchset?

By the way, in v3 I will have to undo the inversion of the runtime capacity
check. The original check allowed balance only if dst_cpu had at least ~5%
more capacity than src_cpu. The inverted check also allows balance to CPUs
of less capacity, as long as the difference is under ~5%.