From: Hongyan Xia <hongyan.xia2@arm.com>
To: Qais Yousef <qyousef@layalina.io>, Ingo Molnar <mingo@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Viresh Kumar <viresh.kumar@linaro.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
Lukasz Luba <lukasz.luba@arm.com>, Wei Wang <wvw@google.com>,
Rick Yiu <rickyiu@google.com>,
Chung-Kai Mei <chungkai@google.com>
Subject: Re: [PATCH 1/4] sched/fair: Be less aggressive in calling cpufreq_update_util()
Date: Tue, 12 Dec 2023 10:47:19 +0000 [thread overview]
Message-ID: <d73ea0ed-ecd8-4c70-b02e-f6fcd2cd7538@arm.com> (raw)
In-Reply-To: <20231208015242.385103-2-qyousef@layalina.io>
On 08/12/2023 01:52, Qais Yousef wrote:
> Due to the way code is structured, it makes a lot of sense to trigger
> cpufreq_update_util() from update_load_avg(). But this is too aggressive
> as in most cases we are iterating through entities in a loop to
> update_load_avg() in the hierarchy. So we end up sending too many
> request in an loop as we're updating the hierarchy.
Do you mean the for_each_sched_entity(se) loop? I think we update CPU
frequency only once at the root CFS?
> Combine this with the rate limit in schedutil, we could end up
> prematurely send up a wrong frequency update before we have actually
> updated all entities appropriately.
>
> Be smarter about it by limiting the trigger to perform frequency updates
> after all accounting logic has done. This ended up being in the
> following points:
>
> 1. enqueue/dequeue_task_fair()
> 2. throttle/unthrottle_cfs_rq()
> 3. attach/detach_task_cfs_rq()
> 4. task_tick_fair()
> 5. __sched_group_set_shares()
>
> This is not 100% ideal still due to other limitations that might be
> a bit harder to handle. Namely we can end up with premature update
> request in the following situations:
>
> a. Simultaneous task enqueue on the CPU where 2nd task is bigger and
> requires higher freq. The trigger to cpufreq_update_util() by the
> first task will lead to dropping the 2nd request until tick. Or
> another CPU in the same policy trigger a freq update.
>
> b. CPUs sharing a policy can end up with the same race in a but the
> simultaneous enqueue happens on different CPUs in the same policy.
>
> The above though are limitations in the governor/hardware, and from
> scheduler point of view at least that's the best we can do. The
> governor might consider smarter logic to aggregate near simultaneous
> request and honour the higher one.
>
> Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
> ---
> kernel/sched/fair.c | 55 ++++++++++++---------------------------------
> 1 file changed, 14 insertions(+), 41 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b83448be3f79..f99910fc6705 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3997,29 +3997,6 @@ static inline void update_cfs_group(struct sched_entity *se)
> }
> #endif /* CONFIG_FAIR_GROUP_SCHED */
>
> -static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)
> -{
> - struct rq *rq = rq_of(cfs_rq);
> -
> - if (&rq->cfs == cfs_rq) {
Here. I think this restricts frequency updates to the root CFS?
> - /*
> - * There are a few boundary cases this might miss but it should
> - * get called often enough that that should (hopefully) not be
> - * a real problem.
> - *
> - * It will not get called when we go idle, because the idle
> - * thread is a different class (!fair), nor will the utilization
> - * number include things like RT tasks.
> - *
> - * As is, the util number is not freq-invariant (we'd have to
> - * implement arch_scale_freq_capacity() for that).
> - *
> - * See cpu_util_cfs().
> - */
> - cpufreq_update_util(rq, flags);
> - }
> -}
> -
> #ifdef CONFIG_SMP
> static inline bool load_avg_is_decayed(struct sched_avg *sa)
> {
> [...]
next prev parent reply other threads:[~2023-12-12 10:47 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-08 1:52 [PATCH 0/4] sched: cpufreq: Remove uclamp max-aggregation Qais Yousef
2023-12-08 1:52 ` [PATCH 1/4] sched/fair: Be less aggressive in calling cpufreq_update_util() Qais Yousef
2023-12-08 10:05 ` Lukasz Luba
2023-12-10 20:51 ` Qais Yousef
2023-12-11 7:56 ` Lukasz Luba
2023-12-12 12:10 ` Qais Yousef
2023-12-14 8:19 ` Lukasz Luba
2023-12-11 18:47 ` Christian Loehle
2023-12-12 12:34 ` Qais Yousef
2023-12-12 13:09 ` Christian Loehle
2023-12-12 13:29 ` Qais Yousef
2023-12-12 10:46 ` Dietmar Eggemann
2023-12-12 12:35 ` Qais Yousef
2023-12-12 18:22 ` Hongyan Xia
2023-12-12 10:47 ` Hongyan Xia [this message]
2023-12-12 11:06 ` Vincent Guittot
2023-12-12 12:40 ` Qais Yousef
2023-12-29 0:25 ` Qais Yousef
2024-01-03 13:41 ` Vincent Guittot
2024-01-04 19:40 ` Qais Yousef
2023-12-18 8:51 ` Dietmar Eggemann
2023-12-17 21:44 ` Qais Yousef
2023-12-08 1:52 ` [PATCH 2/4] sched/uclamp: Remove rq max aggregation Qais Yousef
2023-12-11 0:08 ` Qais Yousef
2023-12-08 1:52 ` [PATCH 3/4] sched/schedutil: Ignore update requests for short running tasks Qais Yousef
2023-12-08 10:42 ` Hongyan Xia
2023-12-10 22:22 ` Qais Yousef
2023-12-11 11:15 ` Hongyan Xia
2023-12-12 12:23 ` Qais Yousef
2023-12-08 1:52 ` [PATCH 4/4] sched/documentation: Remove reference to max aggregation Qais Yousef
2023-12-18 8:19 ` [PATCH 0/4] sched: cpufreq: Remove uclamp max-aggregation Dietmar Eggemann
2023-12-17 21:23 ` Qais Yousef
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d73ea0ed-ecd8-4c70-b02e-f6fcd2cd7538@arm.com \
--to=hongyan.xia2@arm.com \
--cc=chungkai@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=lukasz.luba@arm.com \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=qyousef@layalina.io \
--cc=rafael@kernel.org \
--cc=rickyiu@google.com \
--cc=vincent.guittot@linaro.org \
--cc=viresh.kumar@linaro.org \
--cc=wvw@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox