From: Qais Yousef <qyousef@layalina.io>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Viresh Kumar <viresh.kumar@linaro.org>,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
Lukasz Luba <lukasz.luba@arm.com>, Wei Wang <wvw@google.com>,
Xuewen Yan <xuewen.yan94@gmail.com>, Hank <han.lin@mediatek.com>,
Jonathan JMChen <Jonathan.JMChen@mediatek.com>
Subject: Re: [RFC PATCH 3/3] sched/fair: Traverse cpufreq policies to detect capacity inversion
Date: Mon, 5 Dec 2022 11:01:59 +0000 [thread overview]
Message-ID: <20221205110159.nd5igwvsaj55jar7@airbuntu> (raw)
In-Reply-To: <CAKfTPtCZYGEvDBe5X1v7TiNZag0atUozGKip6EAgvZDWyo8e-g@mail.gmail.com>
On 12/04/22 12:35, Vincent Guittot wrote:
> On Sat, 3 Dec 2022 at 15:33, Qais Yousef <qyousef@layalina.io> wrote:
> >
> > On 12/02/22 15:57, Vincent Guittot wrote:
> >
> > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > index 7c0dd57e562a..4bbbca85134b 100644
> > > > --- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -8856,23 +8856,20 @@ static void update_cpu_capacity(struct sched_domain *sd, int cpu)
> > > > * * Thermal pressure will impact all cpus in this perf domain
> > > > * equally.
> > > > */
> > > > - if (sched_energy_enabled()) {
> > > > + if (static_branch_unlikely(&sched_asym_cpucapacity)) {
> > > > unsigned long inv_cap = capacity_orig - thermal_load_avg(rq);
> > > > - struct perf_domain *pd = rcu_dereference(rq->rd->pd);
> > > > + struct cpufreq_policy *policy, __maybe_unused *policy_n;
> > > >
> > > > rq->cpu_capacity_inverted = 0;
> > > >
> > > > - SCHED_WARN_ON(!rcu_read_lock_held());
> > > > -
> > > > - for (; pd; pd = pd->next) {
> > > > - struct cpumask *pd_span = perf_domain_span(pd);
> > > > + for_each_active_policy_safe(policy, policy_n) {
> > >
> > > So you are looping all cpufreq policy (and before the perf domain) in
> > > the period load balance. That' really not something we should or want
> > > to do
> >
> > Why is it not acceptable in the period load balance but acceptable in the hot
> > wake up path in feec()? What's the difference?
>
> This patch loops on all cpufreq policy in sched softirq, how can this
> be sane ? and not only in eas mode but also in the default asymmetric
Hmm I'm still puzzled. Why it's not sane to do it here but it's okay to do it
in the wake up path in feec()?
AFAICT the number of cpufreq policies and perf domains have 1:1 mapping. In
feec() we not only loop perf domains, but we go through each cpu of each domain
to find max_spare_cap then compute_energy(). Which is more intensive.
In worst case scenario where there's a perf domain for each cpu, then we'll
loop through every CPU *and* compute energy its energy cost in the wake up
path.
Why it's deemed acceptable in wake up path which is more critical AFAIU but not
period load balance which is expected to do more work? What am I missing?
> performance one.
>
> This inverted detection doesn't look like the right way to fix your
> problem IMO. That being said, i agree that I haven't made any other
> proposal apart that I think that you should use a different rules for
> task and for overutilized and part of your problem comes from this.
We discussed this before; I need to revisit the thread but I can't see how
overutilized different than task will fix the issue. They should be unified by
design.
I'm all ears if there's a simpler way to address the problem :-)
The problem is that thermal pressure on big cpu is not important from
uclamp perspective until it is in inversion state. It is quite common to have
a system where the medium capacity is in 500 range. If the big is under thermal
pressure that it drops to 800, then it is still a fitting CPU from uclamp
perspective. Keep in mind uclamp_min is useful for tasks whose utilization is
small So we need to be selective when thermal pressure is actually helping out
or just creating unnecessary problems.
The only other option I had in mind was to do the detection when we update the
thermal_pressure in the topology code. But that didn't look better alternative
to me.
>
> Then this make eas and util_fits_cpu even more Arm specific and I
What is the Arm specific part about it? Why it wouldn't work on non-Arm
systems?
> still hope to merge sched_asym_cpucapacity and asym_packing a some
> levels because they looks more and more similar but each side is
> trying to add some SoC specific policy
Oh, it seems Intel relies on asym_packing for their hybrid support approach?
I think sched_asym_cpucapacity was designed to be generic. If I gathered
correctly lack of support for SMT and inability to provide energy model outside
of DT were some required extensions.
To be honest, I personally think EAS can be useful on SMP systems and it would
be nice to enable it outside of sched_asym_cpucapacity.
I'm interested to hear more about this unification idea actually. If you feel
a bit chatty to describe in more detail how do you see this being unified, that
could be enlightening for some of us who work in this area :-)
Thanks!
--
Qais Yousef
next prev parent reply other threads:[~2022-12-05 11:02 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-27 14:17 [PATCH 0/3] Fixes for uclamp and capacity inversion detection Qais Yousef
2022-11-27 14:17 ` [PATCH 1/3] sched/uclamp: Fix a uninitialized variable warnings Qais Yousef
2022-12-01 22:38 ` Dietmar Eggemann
2022-12-03 14:32 ` Qais Yousef
2022-12-08 14:51 ` [PATCH v2] " Qais Yousef
2022-11-27 14:17 ` [PATCH 2/3] sched/fair: Fixes for capacity inversion detection Qais Yousef
2022-12-01 22:39 ` Dietmar Eggemann
2022-12-03 14:32 ` Qais Yousef
2022-12-08 14:54 ` [PATCH v2] " Qais Yousef
2022-12-08 14:58 ` Qais Yousef
2022-11-27 14:17 ` [RFC PATCH 3/3] sched/fair: Traverse cpufreq policies to detect capacity inversion Qais Yousef
2022-11-30 18:27 ` Rafael J. Wysocki
2022-12-03 14:32 ` Qais Yousef
2022-12-05 12:39 ` Rafael J. Wysocki
2022-12-05 14:09 ` Qais Yousef
2022-12-02 14:57 ` Vincent Guittot
2022-12-03 14:33 ` Qais Yousef
2022-12-04 11:35 ` Vincent Guittot
2022-12-05 11:01 ` Qais Yousef [this message]
2022-12-06 18:12 ` Vincent Guittot
2022-12-08 14:05 ` Qais Yousef
2022-12-09 16:47 ` Vincent Guittot
2022-12-12 18:43 ` Qais Yousef
2022-12-13 17:38 ` Dietmar Eggemann
2022-12-15 17:46 ` Vincent Guittot
2022-12-20 11:50 ` Qais Yousef
2022-12-13 17:42 ` Lukasz Luba
2022-12-20 11:51 ` Qais Yousef
2022-12-20 12:52 ` Lukasz Luba
2022-12-15 17:39 ` Vincent Guittot
2022-12-20 12:32 ` Qais Yousef
2022-12-20 13:50 ` Vincent Guittot
2022-12-23 11:58 ` Qais Yousef
2022-12-27 13:33 ` Vincent Guittot
2022-12-28 17:18 ` Vincent Guittot
2023-01-09 16:40 ` Qais Yousef
2023-01-10 16:38 ` Vincent Guittot
2023-01-10 16:44 ` Qais Yousef
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20221205110159.nd5igwvsaj55jar7@airbuntu \
--to=qyousef@layalina.io \
--cc=Jonathan.JMChen@mediatek.com \
--cc=dietmar.eggemann@arm.com \
--cc=han.lin@mediatek.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=lukasz.luba@arm.com \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=rafael@kernel.org \
--cc=vincent.guittot@linaro.org \
--cc=viresh.kumar@linaro.org \
--cc=wvw@google.com \
--cc=xuewen.yan94@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox