public inbox for linux-pm@vger.kernel.org
From: Hongyan Xia <hongyan.xia2@arm.com>
To: Qais Yousef <qyousef@layalina.io>, Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	Lukasz Luba <lukasz.luba@arm.com>, Wei Wang <wvw@google.com>,
	Rick Yiu <rickyiu@google.com>,
	Chung-Kai Mei <chungkai@google.com>
Subject: Re: [PATCH 1/4] sched/fair: Be less aggressive in calling cpufreq_update_util()
Date: Tue, 12 Dec 2023 10:47:19 +0000
Message-ID: <d73ea0ed-ecd8-4c70-b02e-f6fcd2cd7538@arm.com>
In-Reply-To: <20231208015242.385103-2-qyousef@layalina.io>

On 08/12/2023 01:52, Qais Yousef wrote:
> Due to the way code is structured, it makes a lot of sense to trigger
> cpufreq_update_util() from update_load_avg(). But this is too aggressive
> as in most cases we are iterating through entities in a loop to
> update_load_avg() in the hierarchy. So we end up sending too many
> requests in a loop as we're updating the hierarchy.

Do you mean the for_each_sched_entity(se) loop? I think we update the CPU
frequency only once, at the root cfs_rq?

> Combined with the rate limit in schedutil, we could end up
> prematurely sending a wrong frequency update before we have actually
> updated all entities appropriately.
> 
> Be smarter about it by limiting the trigger to perform frequency updates
> after all accounting logic is done. This ended up being at the
> following points:
> 
> 1. enqueue/dequeue_task_fair()
> 2. throttle/unthrottle_cfs_rq()
> 3. attach/detach_task_cfs_rq()
> 4. task_tick_fair()
> 5. __sched_group_set_shares()
> 
> This is not 100% ideal still due to other limitations that might be
> a bit harder to handle. Namely we can end up with premature update
> request in the following situations:
> 
> a. Simultaneous task enqueue on the CPU where the 2nd task is bigger and
>     requires a higher freq. The trigger to cpufreq_update_util() by the
>     first task will lead to dropping the 2nd request until the tick, or
>     until another CPU in the same policy triggers a freq update.
> 
> b. CPUs sharing a policy can end up with the same race as in (a), but the
>     simultaneous enqueue happens on different CPUs in the same policy.
> 
> The above, though, are limitations in the governor/hardware, and from
> the scheduler's point of view at least that's the best we can do. The
> governor might consider smarter logic to aggregate near-simultaneous
> requests and honour the higher one.
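
To make case (a) concrete, here is a toy model of a rate-limited
governor. It is only a sketch under stated assumptions: struct
toy_governor, toy_request_freq() and toy_case_a() are hypothetical names,
and the timestamps, frequencies and the 2 ms rate limit are arbitrary.
None of it is schedutil code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified governor state; not the schedutil structures. */
struct toy_governor {
	uint64_t last_update_ns;   /* time of the last accepted request */
	uint64_t rate_limit_ns;    /* minimum gap between accepted requests */
	unsigned int freq;         /* last frequency actually applied */
	bool has_updated;          /* false until the first request is accepted */
};

/* Accept a request only if the rate limit has expired; otherwise drop it. */
static bool toy_request_freq(struct toy_governor *g, uint64_t now_ns,
			     unsigned int freq)
{
	assert(g != NULL);

	if (g->has_updated && now_ns - g->last_update_ns < g->rate_limit_ns)
		return false;  /* dropped: too soon after the previous update */

	g->last_update_ns = now_ns;
	g->freq = freq;
	g->has_updated = true;
	return true;
}

/* Case (a): a small task is enqueued, then a bigger one shortly after. */
static unsigned int toy_case_a(void)
{
	struct toy_governor g = { .rate_limit_ns = 2000000 }; /* 2 ms, arbitrary */

	toy_request_freq(&g, 1000000, 800);   /* first task: accepted */
	toy_request_freq(&g, 1100000, 1800);  /* bigger task: dropped */
	return g.freq;                        /* frequency left in effect */
}
```

In this model toy_case_a() leaves the lower frequency in effect, because
the second, higher request arrives inside the rate-limit window and is
dropped.
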
> 
> Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
> ---
>   kernel/sched/fair.c | 55 ++++++++++++---------------------------------
>   1 file changed, 14 insertions(+), 41 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b83448be3f79..f99910fc6705 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3997,29 +3997,6 @@ static inline void update_cfs_group(struct sched_entity *se)
>   }
>   #endif /* CONFIG_FAIR_GROUP_SCHED */
>   
> -static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)
> -{
> -	struct rq *rq = rq_of(cfs_rq);
> -
> -	if (&rq->cfs == cfs_rq) {

Here: I think this check restricts frequency updates to the root cfs_rq?
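
To illustrate what I mean, a minimal user-space sketch of that check,
assuming a three-level hierarchy on a single runqueue. The structs are toy
stand-ins, the toy_* names are hypothetical, and the walk is only a
simplification of what for_each_sched_entity() does; this is not kernel
code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct rq;

struct cfs_rq {
	struct rq *rq;          /* owning runqueue */
	struct cfs_rq *parent;  /* NULL for the root cfs_rq */
};

struct rq {
	struct cfs_rq cfs;      /* root cfs_rq, embedded in the runqueue */
	int freq_updates;       /* counts simulated cpufreq_update_util() calls */
};

/* Same shape as the removed helper: only the root triggers an update. */
static void toy_cfs_rq_util_change(struct cfs_rq *cfs_rq)
{
	struct rq *rq = cfs_rq->rq;

	if (&rq->cfs == cfs_rq)
		rq->freq_updates++;
}

/* Walk from a leaf cfs_rq up to the root, one level per iteration. */
static int toy_walk_hierarchy(struct rq *rq, struct cfs_rq *leaf)
{
	assert(rq != NULL && leaf != NULL);

	rq->freq_updates = 0;
	for (struct cfs_rq *c = leaf; c != NULL; c = c->parent)
		toy_cfs_rq_util_change(c);
	return rq->freq_updates;
}

/* Build the chain root <- mid <- leaf on one runqueue and walk it. */
static int toy_demo(void)
{
	struct rq rq = { .freq_updates = 0 };
	struct cfs_rq mid = { .rq = &rq, .parent = &rq.cfs };
	struct cfs_rq leaf = { .rq = &rq, .parent = &mid };

	rq.cfs.rq = &rq;
	rq.cfs.parent = NULL;
	return toy_walk_hierarchy(&rq, &leaf);
}
```

In this model the walk visits three levels but the counter is bumped only
once, at the root, which is why I would expect a single frequency update
per walk.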

> -		/*
> -		 * There are a few boundary cases this might miss but it should
> -		 * get called often enough that that should (hopefully) not be
> -		 * a real problem.
> -		 *
> -		 * It will not get called when we go idle, because the idle
> -		 * thread is a different class (!fair), nor will the utilization
> -		 * number include things like RT tasks.
> -		 *
> -		 * As is, the util number is not freq-invariant (we'd have to
> -		 * implement arch_scale_freq_capacity() for that).
> -		 *
> -		 * See cpu_util_cfs().
> -		 */
> -		cpufreq_update_util(rq, flags);
> -	}
> -}
> -
>   #ifdef CONFIG_SMP
>   static inline bool load_avg_is_decayed(struct sched_avg *sa)
>   {
> [...]
