Date: Tue, 30 Jan 2024 00:50:28 +0000
From: Qais Yousef
To: Vincent Guittot
Cc: linux@armlinux.org.uk, catalin.marinas@arm.com, will@kernel.org,
	sudeep.holla@arm.com, rafael@kernel.org, viresh.kumar@linaro.org,
	agross@kernel.org, andersson@kernel.org, konrad.dybcio@linaro.org,
	mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
	mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
	lukasz.luba@arm.com, rui.zhang@intel.com, mhiramat@kernel.org,
	daniel.lezcano@linaro.org, amit.kachhap@gmail.com, corbet@lwn.net,
	gregkh@linuxfoundation.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-arm-msm@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org
Subject: Re: [PATCH v4 2/5] sched: Take cpufreq feedback into account
Message-ID: <20240130005028.vbqg27ctmanxsej6@airbuntu>
References: <20240109164655.626085-1-vincent.guittot@linaro.org>
	<20240109164655.626085-3-vincent.guittot@linaro.org>
	<20240130002652.ipdyqs3sjy6qqt6t@airbuntu>
In-Reply-To: <20240130002652.ipdyqs3sjy6qqt6t@airbuntu>

On 01/30/24 00:26, Qais Yousef wrote:
> On 01/09/24 17:46, Vincent Guittot wrote:
> > Aggregate the different pressures applied on the capacity of CPUs and
> > create a new function that returns the actual capacity of the CPU:
> >   get_actual_cpu_capacity()
> >
> > Signed-off-by: Vincent Guittot
> > Reviewed-by: Lukasz Luba
> > ---
> >  kernel/sched/fair.c | 45 +++++++++++++++++++++++++--------------------
> >  1 file changed, 25 insertions(+), 20 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 9cc20855dc2b..e54bbf8b4936 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -4910,13 +4910,22 @@ static inline void util_est_update(struct cfs_rq *cfs_rq,
> >  	trace_sched_util_est_se_tp(&p->se);
> >  }
> >
> > +static inline unsigned long get_actual_cpu_capacity(int cpu)
> > +{
> > +	unsigned long capacity = arch_scale_cpu_capacity(cpu);
> > +
> > +	capacity -= max(thermal_load_avg(cpu_rq(cpu)), cpufreq_get_pressure(cpu));
>
> Does cpufreq_get_pressure() reflect thermally throttled frequency, or just the
> policy->max being capped by user etc? I didn't see an update to cpufreq when we
> call topology_update_hw_pressure(). Not sure if it'll go through another path.

It is done via the cooling device. And I assume any limitations on freq due to
power etc. always cause policy->max to change. (Sorry if I missed earlier
discussions about this.)

>
> maxing with thermal_load_avg() will change the behavior below where we used to
> compare against instantaneous pressure. The concern was that it can not only
> appear quickly, but disappear quickly too. thermal_load_avg() will decay
> slowly, no? This means we'll lose a lot of opportunities for better task
> placement until this decays, which can take a relatively long time.
>
> So maxing handles the direction where a pressure suddenly appears. But it
> doesn't handle where it disappears.
>
> I suspect your thoughts are that if it was transient then thermal_load_avg()
> should be small anyway - which I think makes sense.
>
> I think we need a comment to explain these nuanced differences.
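
The asymmetry described above (the max follows a pressure that appears at once, but lags when it disappears) can be illustrated with a toy model. This is only a sketch and not kernel code: toy_avg_update(), toy_actual_capacity(), toy_run() and TOY_CAPACITY_ORIG are hypothetical names, and the 1/8 exponential average is an assumed stand-in for the real PELT decay behind thermal_load_avg(), whose rate differs.

```c
#include <assert.h>

/* Toy model only: none of these helpers exist in the kernel. */

#define TOY_CAPACITY_ORIG 1024UL /* stand-in for arch_scale_cpu_capacity() */

/*
 * Toy exponential average standing in for thermal_load_avg(): each step
 * moves the average 1/8 of the way toward the new sample. The real PELT
 * signal uses a different decay rate.
 */
static unsigned long toy_avg_update(unsigned long avg, unsigned long sample)
{
	if (sample >= avg)
		return avg + (sample - avg) / 8;
	return avg - (avg - sample) / 8;
}

/* Same shape as get_actual_cpu_capacity(): subtract the larger pressure. */
static unsigned long toy_actual_capacity(unsigned long avg_pressure,
					 unsigned long inst_pressure)
{
	unsigned long pressure = avg_pressure > inst_pressure ?
				 avg_pressure : inst_pressure;

	assert(pressure <= TOY_CAPACITY_ORIG);
	return TOY_CAPACITY_ORIG - pressure;
}

/*
 * Feed a constant instantaneous pressure into the average for `steps`
 * updates, starting from `avg`, and return the resulting average.
 */
static unsigned long toy_run(unsigned long avg, unsigned long inst_pressure,
			     unsigned int steps)
{
	while (steps--)
		avg = toy_avg_update(avg, inst_pressure);
	return avg;
}
```

In this model, a pressure of 400 that appears while the average is still 0 is reflected immediately: toy_actual_capacity(0, 400) is 624. When the same pressure vanishes while the average sits at 400, the reported capacity stays at 624 and only climbs to 674 after one update, since toy_run(400, 0, 1) leaves the average at 350.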
>
> > +
> > +	return capacity;
> > +}
> > +
> >  static inline int util_fits_cpu(unsigned long util,
> >  				unsigned long uclamp_min,
> >  				unsigned long uclamp_max,
> >  				int cpu)
> >  {
> > -	unsigned long capacity_orig, capacity_orig_thermal;
> >  	unsigned long capacity = capacity_of(cpu);
> > +	unsigned long capacity_orig;
> >  	bool fits, uclamp_max_fits;
> >
> >  	/*
> > @@ -4948,7 +4957,6 @@ static inline int util_fits_cpu(unsigned long util,
> >  	 * goal is to cap the task. So it's okay if it's getting less.
> >  	 */
> >  	capacity_orig = arch_scale_cpu_capacity(cpu);
> > -	capacity_orig_thermal = capacity_orig - arch_scale_thermal_pressure(cpu);
> >
> >  	/*
> >  	 * We want to force a task to fit a cpu as implied by uclamp_max.
> > @@ -5023,7 +5031,8 @@ static inline int util_fits_cpu(unsigned long util,
> >  	 * handle the case uclamp_min > uclamp_max.
> >  	 */
> >  	uclamp_min = min(uclamp_min, uclamp_max);
> > -	if (fits && (util < uclamp_min) && (uclamp_min > capacity_orig_thermal))
> > +	if (fits && (util < uclamp_min) &&
> > +	    (uclamp_min > get_actual_cpu_capacity(cpu)))
> >  		return -1;
> >
> >  	return fits;
> > @@ -7404,7 +7413,7 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> >  		 * Look for the CPU with best capacity.
> >  		 */
> >  		else if (fits < 0)
> > -			cpu_cap = arch_scale_cpu_capacity(cpu) - thermal_load_avg(cpu_rq(cpu));
> > +			cpu_cap = get_actual_cpu_capacity(cpu);
> >
> >  		/*
> >  		 * First, select CPU which fits better (-1 being better than 0).
> > @@ -7897,8 +7906,8 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> >  	struct root_domain *rd = this_rq()->rd;
> >  	int cpu, best_energy_cpu, target = -1;
> >  	int prev_fits = -1, best_fits = -1;
> > -	unsigned long best_thermal_cap = 0;
> > -	unsigned long prev_thermal_cap = 0;
> > +	unsigned long best_actual_cap = 0;
> > +	unsigned long prev_actual_cap = 0;
> >  	struct sched_domain *sd;
> >  	struct perf_domain *pd;
> >  	struct energy_env eenv;
> > @@ -7928,7 +7937,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> >
> >  	for (; pd; pd = pd->next) {
> >  		unsigned long util_min = p_util_min, util_max = p_util_max;
> > -		unsigned long cpu_cap, cpu_thermal_cap, util;
> > +		unsigned long cpu_cap, cpu_actual_cap, util;
> >  		long prev_spare_cap = -1, max_spare_cap = -1;
> >  		unsigned long rq_util_min, rq_util_max;
> >  		unsigned long cur_delta, base_energy;
> > @@ -7940,18 +7949,17 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> >  		if (cpumask_empty(cpus))
> >  			continue;
> >
> > -		/* Account thermal pressure for the energy estimation */
> > +		/* Account external pressure for the energy estimation */
> >  		cpu = cpumask_first(cpus);
> > -		cpu_thermal_cap = arch_scale_cpu_capacity(cpu);
> > -		cpu_thermal_cap -= arch_scale_thermal_pressure(cpu);
> > +		cpu_actual_cap = get_actual_cpu_capacity(cpu);
> >
> > -		eenv.cpu_cap = cpu_thermal_cap;
> > +		eenv.cpu_cap = cpu_actual_cap;
> >  		eenv.pd_cap = 0;
> >
> >  		for_each_cpu(cpu, cpus) {
> >  			struct rq *rq = cpu_rq(cpu);
> >
> > -			eenv.pd_cap += cpu_thermal_cap;
> > +			eenv.pd_cap += cpu_actual_cap;
> >
> >  			if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> >  				continue;
> > @@ -8022,7 +8030,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> >  			if (prev_delta < base_energy)
> >  				goto unlock;
> >  			prev_delta -= base_energy;
> > -			prev_thermal_cap = cpu_thermal_cap;
> > +			prev_actual_cap = cpu_actual_cap;
> >  			best_delta = min(best_delta, prev_delta);
> >  		}
> >
> > @@ -8037,7 +8045,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> >  			 * but best energy cpu has better capacity.
> >  			 */
> >  			if ((max_fits < 0) &&
> > -			    (cpu_thermal_cap <= best_thermal_cap))
> > +			    (cpu_actual_cap <= best_actual_cap))
> >  				continue;
> >
> >  			cur_delta = compute_energy(&eenv, pd, cpus, p,
> > @@ -8058,14 +8066,14 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> >  			best_delta = cur_delta;
> >  			best_energy_cpu = max_spare_cap_cpu;
> >  			best_fits = max_fits;
> > -			best_thermal_cap = cpu_thermal_cap;
> > +			best_actual_cap = cpu_actual_cap;
> >  		}
> >  	}
> >  	rcu_read_unlock();
> >
> >  	if ((best_fits > prev_fits) ||
> >  	    ((best_fits > 0) && (best_delta < prev_delta)) ||
> > -	    ((best_fits < 0) && (best_thermal_cap > prev_thermal_cap)))
> > +	    ((best_fits < 0) && (best_actual_cap > prev_actual_cap)))
> >  		target = best_energy_cpu;
> >
> >  	return target;
> > @@ -9441,8 +9449,8 @@ static inline void init_sd_lb_stats(struct sd_lb_stats *sds)
> >
> >  static unsigned long scale_rt_capacity(int cpu)
> >  {
> > +	unsigned long max = get_actual_cpu_capacity(cpu);
> >  	struct rq *rq = cpu_rq(cpu);
> > -	unsigned long max = arch_scale_cpu_capacity(cpu);
> >  	unsigned long used, free;
> >  	unsigned long irq;
> >
> > @@ -9454,12 +9462,9 @@ static unsigned long scale_rt_capacity(int cpu)
> >  	/*
> >  	 * avg_rt.util_avg and avg_dl.util_avg track binary signals
> >  	 * (running and not running) with weights 0 and 1024 respectively.
> > -	 * avg_thermal.load_avg tracks thermal pressure and the weighted
> > -	 * average uses the actual delta max capacity(load).
> >  	 */
> >  	used = READ_ONCE(rq->avg_rt.util_avg);
> >  	used += READ_ONCE(rq->avg_dl.util_avg);
> > -	used += thermal_load_avg(rq);
> >
> >  	if (unlikely(used >= max))
> >  		return 1;
> > --
> > 2.34.1
> >