From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Qais Yousef <qyousef@layalina.io>, Ingo Molnar <mingo@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Viresh Kumar <viresh.kumar@linaro.org>
Cc: Juri Lelli <juri.lelli@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
John Stultz <jstultz@google.com>,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 12/16] sched/pelt: Add new waiting_avg to record when runnable && !running
Date: Wed, 18 Sep 2024 09:01:38 +0200
Message-ID: <7806add6-8b3b-4dc6-b36c-4e7e23493a26@arm.com>
In-Reply-To: <20240820163512.1096301-13-qyousef@layalina.io>
On 20/08/2024 18:35, Qais Yousef wrote:
> This info will be useful to understand how long tasks end up waiting
> behind other tasks. This info is recorded for tasks only, and
> added/subtracted from root cfs_rq on __update_load_avg_se().
>
> It also helps to decouple util_avg, which indicates a task's
> computational demand, from the fact that the CPU might need to run
> faster to reduce the waiting time. It has been a point of confusion in
> the past while discussing uclamp and util_avg and the fact that not
> keeping freq high means tasks will take longer to run and cause delays.
> Isolating this source of delay into its own signal would be a better
> way to take it into account when making decisions independently of the
> task's/CPU's computational demands.
>
> It is not used now, but will be used later to help drive DVFS headroom.
> It could become a helpful metric to help us manage waiting latencies in
> general, for example in load balance.
>
> TODO: waiting_avg should use rq_clock_task() as it doesn't care about
> invariance. Waiting time should reflect actual wait in realtime as this
> is the measure of latency that users care about.
Since you use PELT for the update, you're bound to use rq_clock_pelt().
If we could have PELT with two time values, then we could have
'util_avg' and an 'invariant util_avg' to cure the slow ramp-up on tiny
CPUs and/or low OPPs, and we wouldn't have to add all of this extra code.
[...]
> @@ -4744,8 +4760,15 @@ static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
> * Track task load average for carrying it to new CPU after migrated, and
> * track group sched_entity load average for task_h_load calculation in migration
> */
> - if (se->avg.last_update_time && !(flags & SKIP_AGE_LOAD))
> + if (se->avg.last_update_time && !(flags & SKIP_AGE_LOAD)) {
> + bool update_rq_waiting_avg = entity_is_task(se) && se_runnable(se);
> +
> + if (update_rq_waiting_avg)
> + sub_waiting_avg(&rq_of(cfs_rq)->cfs, se);
> __update_load_avg_se(now, cfs_rq, se);
> + if (update_rq_waiting_avg)
> + add_waiting_avg(&rq_of(cfs_rq)->cfs, se);
> + }
That's a pretty convoluted design: util_est-style attach/detach within
the PELT update, but only for tasks and not for all se's.
Doesn't 'p->se.avg.runnable_avg - p->se.avg.util_avg' give you what you
want? It's invariant, but so is this here.
Commit 50181c0cff31 ("sched/pelt: Avoid underestimation of task
utilization") uses some of it already.
+ /*
+ * To avoid underestimate of task utilization, skip updates of EWMA if
+ * we cannot grant that thread got all CPU time it wanted.
+ */
+ if ((ue.enqueued + UTIL_EST_MARGIN) < task_runnable(p))
+ goto done;
[...]
> @@ -6786,6 +6814,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> * estimated utilization, before we update schedutil.
> */
> util_est_enqueue(&rq->cfs, p);
> + add_waiting_avg(&rq->cfs, se);
This would also have to be checked against the new p->se.sched_delayed
thing.
> /*
> * If in_iowait is set, the code below may not trigger any cpufreq
> @@ -6874,6 +6903,7 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> bool was_sched_idle = sched_idle_rq(rq);
>
> util_est_dequeue(&rq->cfs, p);
> + sub_waiting_avg(&rq->cfs, se);
  ^^
This won't compile: 'se' vs. '&p->se'.
[...]