From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id F37FB27E054 for ; Mon, 2 Feb 2026 10:48:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1770029291; cv=none; b=hYjYgrJyoLj79Q0c7eNz8VcOESo0ObAG1RQn4esYgqVdYsAh0WMatziqLMsSV58gAJ8PEN+w5Y7NSqKCPEaC1memtPJOo/mJpLsEBmJ8y0OiKB5kREhdtv5Io8VR4I4OA7zGkdh4o3COg36dVcaVfg7m9YLzxeuc31J1kUZzo84= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1770029291; c=relaxed/simple; bh=LiZ6IgVjHlJjasW4xdlKeAg38wGXg18r72PWF5QPdqc=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=RaD4kXsKHKX95jcFNYL27vQc5hfXxNVIAmGmT0CIHCHZEXQSBiG7s3Q1b6dKoiDccPH6UFoBewUYFtE7Rxvih14A5nAZpW54k0pyR4t6kiAMk81sxvjQO7VZoPu1N3HDPKYyqjYFrvsK33vWc/13Jh7FQHXbMTGZmRwm6I0VwAQ= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EFD9E339; Mon, 2 Feb 2026 02:48:01 -0800 (PST) Received: from [10.1.35.46] (unknown [10.1.35.46]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 7232E3F740; Mon, 2 Feb 2026 02:48:06 -0800 (PST) Message-ID: Date: Mon, 2 Feb 2026 10:48:04 +0000 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v2 RSEND] sched/fair: Optimize EAS energy calculation complexity from O(N) to O(1) inside inner loop To: Qiliang Yuan , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot Cc: Qiliang Yuan , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , linux-kernel@vger.kernel.org References: <20260202030512.2792311-1-realwujing@gmail.com> Content-Language: en-US From: Christian Loehle In-Reply-To: <20260202030512.2792311-1-realwujing@gmail.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit On 2/2/26 03:05, Qiliang Yuan wrote: > Pre-calculate the base maximum utilization of each performance domain during the > main loop of find_energy_efficient_cpu() and cache it in the local > 'energy_env' structure. > > By caching this base value, the maximum utilization for candidate CPU > placements (such as prev_cpu and max_spare_cap_cpu) can be determined in > O(1) time, eliminating redundant scans of the performance domain. This > optimizes the energy estimation path by reducing the number of scans per > performance domain from three to one. Which is still O(n), I think the title is misleading. > > This change significantly reduces wake-up latency on systems with high core > counts or complex performance domain topologies by minimizing the overall > complexity of the Energy-Aware Scheduling (EAS) calculation. I don't think this is actually true. EAS doesn't really work with a large number of PDs because of the expensive wakeup path. I don't think there's an EAS system out there where this would actually make a measurable impact. Most have 2 or 3, the highest number of PDs I'm aware of is 5, but FWIW the actual change looks correct to me. > > Signed-off-by: Qiliang Yuan > Signed-off-by: Qiliang Yuan > --- > v2: > - Ensure RCU safety by using local 'energy_env' for caching instead of > modifying the shared 'perf_domain' structure. > - Consolidate pre-calculation into the main loop to avoid an extra pass > over the performance domains. > v1: > - Optimize energy calculation by pre-calculating performance domain max utilization. > - Add max_util and max_spare_cap_cpu to struct perf_domain. > - Reduce inner loop complexity from O(N) to O(1) for energy estimation. > > kernel/sched/fair.c | 36 ++++++++++++++++++------------------ > 1 file changed, 18 insertions(+), 18 deletions(-) > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index e71302282671..5c114c49c202 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -8148,6 +8148,7 @@ struct energy_env { > unsigned long pd_busy_time; > unsigned long cpu_cap; > unsigned long pd_cap; > + unsigned long pd_max_util; > }; > > /* > @@ -8215,41 +8216,32 @@ static inline void eenv_pd_busy_time(struct energy_env *eenv, > * exceed @eenv->cpu_cap. > */ > static inline unsigned long > -eenv_pd_max_util(struct energy_env *eenv, struct cpumask *pd_cpus, > +eenv_pd_max_util(struct energy_env *eenv, struct perf_domain *pd, > struct task_struct *p, int dst_cpu) > { > - unsigned long max_util = 0; > - int cpu; > + unsigned long max_util = eenv->pd_max_util; > > - for_each_cpu(cpu, pd_cpus) { > - struct task_struct *tsk = (cpu == dst_cpu) ? p : NULL; > - unsigned long util = cpu_util(cpu, p, dst_cpu, 1); > + if (dst_cpu >= 0 && cpumask_test_cpu(dst_cpu, perf_domain_span(pd))) { > + unsigned long util = cpu_util(dst_cpu, p, dst_cpu, 1); > unsigned long eff_util, min, max; > > - /* > - * Performance domain frequency: utilization clamping > - * must be considered since it affects the selection > - * of the performance domain frequency. > - * NOTE: in case RT tasks are running, by default the min > - * utilization can be max OPP. > - */ > - eff_util = effective_cpu_util(cpu, util, &min, &max); > + eff_util = effective_cpu_util(dst_cpu, util, &min, &max); > > /* Task's uclamp can modify min and max value */ > - if (tsk && uclamp_is_used()) { > + if (uclamp_is_used()) { > min = max(min, uclamp_eff_value(p, UCLAMP_MIN)); > > /* > * If there is no active max uclamp constraint, > * directly use task's one, otherwise keep max. > */ > - if (uclamp_rq_is_idle(cpu_rq(cpu))) > + if (uclamp_rq_is_idle(cpu_rq(dst_cpu))) > max = uclamp_eff_value(p, UCLAMP_MAX); > else > max = max(max, uclamp_eff_value(p, UCLAMP_MAX)); > } > > - eff_util = sugov_effective_cpu_perf(cpu, eff_util, min, max); > + eff_util = sugov_effective_cpu_perf(dst_cpu, eff_util, min, max); > max_util = max(max_util, eff_util); > } > > @@ -8265,7 +8257,7 @@ static inline unsigned long > compute_energy(struct energy_env *eenv, struct perf_domain *pd, > struct cpumask *pd_cpus, struct task_struct *p, int dst_cpu) > { > - unsigned long max_util = eenv_pd_max_util(eenv, pd_cpus, p, dst_cpu); > + unsigned long max_util = eenv_pd_max_util(eenv, pd, p, dst_cpu); > unsigned long busy_time = eenv->pd_busy_time; > unsigned long energy; > > @@ -8376,12 +8368,20 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu) > > eenv.cpu_cap = cpu_actual_cap; > eenv.pd_cap = 0; > + eenv.pd_max_util = 0; > > for_each_cpu(cpu, cpus) { > struct rq *rq = cpu_rq(cpu); > + unsigned long util_b, eff_util_b, min_b, max_b; > > eenv.pd_cap += cpu_actual_cap; > > + /* Pre-calculate base max utilization for the performance domain */ > + util_b = cpu_util(cpu, p, -1, 1); > + eff_util_b = effective_cpu_util(cpu, util_b, &min_b, &max_b); > + eff_util_b = sugov_effective_cpu_perf(cpu, eff_util_b, min_b, max_b); > + eenv.pd_max_util = max(eenv.pd_max_util, eff_util_b); > + > if (!cpumask_test_cpu(cpu, sched_domain_span(sd))) > continue; >