Date: Fri, 8 May 2026 05:52:32 -0700
From: Ricardo Neri
To: Christian Loehle
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Valentin Schneider, Tim C Chen, Chen Yu, Barry Song,
    "Rafael J. Wysocki", Len Brown, ricardo.neri@intel.com,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 3/4] sched/fair: Allow load balancing between CPUs of identical capacity
Message-ID: <20260508125232.GA6459@ranerica-svr.sc.intel.com>
References: <20260429-rneri-fix-cas-clusters-v2-0-cd787de35cc6@linux.intel.com>
 <20260429-rneri-fix-cas-clusters-v2-3-cd787de35cc6@linux.intel.com>
 <0171d56e-c8b3-4a17-85a5-93ac407aae5f@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <0171d56e-c8b3-4a17-85a5-93ac407aae5f@arm.com>
User-Agent: Mutt/1.9.4 (2018-02-28)

On Wed, May 06, 2026 at 02:10:22PM +0100, Christian Loehle wrote:
> On 4/29/26 22:19, Ricardo Neri wrote:
> > sched_balance_find_src_rq() avoids selecting a runqueue with a single
> > running task as busiest if doing so results in migrating the task to a
> > CPU with less than ~5% of extra capacity. It also unintentionally
> > prevents migrations between CPUs of identical capacity.
> >
> > When CONFIG_SCHED_CLUSTER is enabled, load should be balanced across
> > clusters of CPUs with the same capacity. Allowing migration between CPUs
> > of identical capacity is necessary to meet this goal.
> >
> > We are interested in the architectural capacity of the involved CPUs,
> > excluding any reductions due to side activity or thermal pressure. Use
> > arch_scale_cpu_capacity().
> >
> > While here, invert the check for runtime capacity for clarity.
> >
> > Signed-off-by: Ricardo Neri
> > ---
> > Changes since v1:
> >  * Used arch_scale_cpu_capacity() instead of capacity_of() to ignore
> >    runtime variability.
> >  * Inverted the check for runtime capacity. (Christian)
> >  * Reworded patch description for clarity.
> > ---
> >  kernel/sched/fair.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 166a5b109e0e..4105717e64fe 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -11816,9 +11816,14 @@ static struct rq *sched_balance_find_src_rq(struct lb_env *env,
> >  		 * eventually lead to active_balancing high->low capacity.
> >  		 * Higher per-CPU capacity is considered better than balancing
> >  		 * average load.
> > +		 *
> > +		 * Cluster scheduling requires balancing load across clusters
> > +		 * of identical capacity. Use architectural capacity to ignore
> > +		 * runtime variability.
> >  		 */
> >  		if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> > -		    !capacity_greater(capacity_of(env->dst_cpu), capacity) &&
> > +		    arch_scale_cpu_capacity(env->dst_cpu) != arch_scale_cpu_capacity(i) &&
> > +		    capacity_greater(capacity, capacity_of(env->dst_cpu)) &&
> >  		    nr_running == 1)
> >  			continue;
> >
>
> I wonder if we shouldn't use the capacity_greater() margin for both, i.e.
> capacity_greater(arch_scale_cpu_capacity(i), arch_scale_cpu_capacity(env->dst_cpu)) &&
>
> For example, the Orion O6 has a cluster with capacity 1024 and one with 984.
> If we allow balancing 984->984, I think it's only consistent to also allow
> 984->1024.

But that would be a change in the current policy, no? Today we allow a
984->1024 balance based on runtime capacity.

The scope of this patchset is to make SCHED_CLUSTER work as expected for
clusters of the same capacity. Perhaps your proposal of using architectural
capacity can be evaluated in a separate patchset?

By the way, in v3 I will have to undo the inversion of the runtime capacity
check. The original check allowed balance only if dst_cpu had at least ~5%
more capacity than src_cpu. The inverted check also allows balance to CPUs
of less capacity, as long as the difference is under ~5%.