From: Qais Yousef <qais.yousef@arm.com>
To: Lukasz Luba <lukasz.luba@arm.com>
Cc: Xuewen Yan <xuewen.yan@unisoc.com>,
dietmar.eggemann@arm.com, rafael@kernel.org,
viresh.kumar@linaro.org, mingo@redhat.com, peterz@infradead.org,
vincent.guittot@linaro.org, rostedt@goodmis.org,
linux-kernel@vger.kernel.org, di.shen@unisoc.com,
Xuewen Yan <xuewen.yan94@gmail.com>
Subject: Re: [PATCH] sched: Take thermal pressure into account when determine rt fits capacity
Date: Tue, 10 May 2022 19:44:36 +0100 [thread overview]
Message-ID: <20220510184436.fdgzzcfqqevinx5p@wubuntu> (raw)
In-Reply-To: <37357c86-bab7-d0c7-88d0-ace63ccdb6c8@arm.com>
On 05/10/22 18:44, Lukasz Luba wrote:
[...]
> To properly answer this question we probably have to analyze the timings
> and this update path - how often it is actually called. Keep in mind
> we are going to solve CPU capacity inversion for RT class, which
> contains latency sensitive tasks. In this approach the information
This was an attempt for a generic inversion detection. We update
rq->cpu_capacity which is used by capacity_of() in the same path.
I didn't feel brave to write a quick patch in the topology code, but we can
certainly do the detection there in topology_update_thermal_pressure().
> about HW status is coming from this CFS load balance path.
> What if that load balance is not called that often as RT might require?
> What if there is a light load on CPUs, but GPU caused them to throttle,
> reducing capacity by a decent chunk e.g. 50%?
> That would translate to some RT periodic task which takes 2ms every
> 8ms to take 4ms, while maybe on other less power hungry CPU it could
> take 3ms.
>
> The usage of thermal_load_avg() in the scale_rt_capacity() looks OK
> for the CFS, but might not be from the RT class point of view.
> The RT class might want to realize faster that CPUs have changed the
> capacity.
> Maybe it's OK with that patch [1] and boot config shifter=-5, but in
> default boot config for shifter=0 we can suffer for hundreds of ms
> running on lower capacity cpu (which is quite high number of frames
> nowadays).
>
> Without a research and experiments data I'm afraid this is too
> big step to make, with this CFS load balance path.
I think Xuewen didn't want to use thermal_load_avg(), and that's the question
I deferred.
>
> >
> > >
> > > > +
> > > > + rq->cpu_capacity_inverted = 0;
> > > > +
> > > > + for_each_possible_cpu(cpu) {
> > > > + unsigned long cap = arch_scale_cpu_capacity(cpu);
> > > > +
> > > > + if (capacity_orig <= cap)
> > > > + continue;
>
> The search loop here assumes that other CPUs (fortunately not in the
> same freq domain) don't suffer due to reduced capacity. This might be
> not true - when we have ~1 Watt budget for all CPUs in the system and
> single big core can use 3-4W at max or single mid core ~1.2W.
I defined capacity inversion against capacity_orig. IMHO that's the sensible
definition to make.
Would be good to hear more/other suggestions.
>
> > > > +
> > > > + if (cap > inv_cap) {
> > > > + rq->cpu_capacity_inverted = inv_cap;
> > > > + break;
> > > > + }
> > > > + }
> > > > +
> > > > + }
> > > >
> > > > sdg->sgc->capacity = capacity;
> > > > sdg->sgc->min_capacity = capacity;
> > > > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > > > index 8dccb34eb190..bfe84c870bf9 100644
> > > > --- a/kernel/sched/sched.h
> > > > +++ b/kernel/sched/sched.h
> > > > @@ -992,6 +992,7 @@ struct rq {
> > > >
> > > > unsigned long cpu_capacity;
> > > > unsigned long cpu_capacity_orig;
> > > > + unsigned long cpu_capacity_inverted;
> > > >
> > > > struct callback_head *balance_callback;
> > > >
> > > > @@ -2807,6 +2808,11 @@ static inline unsigned long capacity_orig_of(int cpu)
> > > > return cpu_rq(cpu)->cpu_capacity_orig;
> > > > }
> > > >
> > > > +static inline unsigned long cpu_in_capacity_inversion(int cpu)
> > > > +{
> > > > + return cpu_rq(cpu)->cpu_capacity_inverted;
> > > > +}
> > > > +
> > > > /**
> > > > * enum cpu_util_type - CPU utilization type
> > > > * @FREQUENCY_UTIL: Utilization used to select frequency
> > > >
> > > >
> > > > --->8---
> > >
> > > The patch is amazing for me, and the complexity is not too high. Would
> > > you please push the patch?
> > > I think the idea is yours, I don't want to use it as my patch v2.
> >
> > I'd be happy to add a commit message so that you can include it in your v2.
> >
> > First, I'd like to hear from Vincent and Lukasz they're happy with this
> > approach.
> >
> > I've been trying to think how we can do this generically but can't find an
> > alternative to the extra loop or additional fallback_cpu_mask. Maybe the mask
> > is okay if we protect it with sched_asymmetric_cpucapacity static key..
> >
>
> I'm sorry Qais, I see that you are trying to bring this
> real-CPU-capacity information into RT, but the source and quality of
> this information IMO might matter. I cannot help you w/o experiment
> results of your proposed approach.
The question I was posing here is whether to handle thermal only in inversion
case as I was suggesting or do better. We are still trickling through the
details, but first, I wanted to make sure there's no objection to this
direction (detect inversion and bail out in rt_task_fits_capacity() for cpus in
capacity inversion).
Cheers
--
Qais Yousef
next prev parent reply other threads:[~2022-05-10 18:44 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-07 5:19 [PATCH] sched: Take thermal pressure into account when determine rt fits capacity Xuewen Yan
2022-04-11 8:21 ` Dietmar Eggemann
2022-04-11 8:52 ` Xuewen Yan
2022-04-11 14:07 ` Dietmar Eggemann
2022-04-13 13:25 ` Lukasz Luba
2022-04-16 2:47 ` Xuewen Yan
2022-04-19 7:14 ` Vincent Guittot
2022-04-19 12:01 ` Lukasz Luba
2022-04-19 12:51 ` Vincent Guittot
2022-04-19 14:13 ` Lukasz Luba
2022-04-21 8:29 ` Vincent Guittot
2022-04-21 10:57 ` Lukasz Luba
2022-04-26 7:39 ` Vincent Guittot
2022-04-29 9:27 ` Lukasz Luba
2022-04-20 13:51 ` Qais Yousef
2022-04-21 8:07 ` Xuewen Yan
2022-04-21 16:15 ` Qais Yousef
2022-04-25 1:31 ` Xuewen Yan
2022-04-25 16:12 ` Qais Yousef
2022-04-26 2:07 ` Xuewen Yan
2022-04-26 8:09 ` Vincent Guittot
2022-04-26 9:30 ` Qais Yousef
2022-04-26 10:06 ` Vincent Guittot
2022-04-26 13:06 ` Qais Yousef
2022-04-26 9:21 ` Qais Yousef
2022-04-27 1:38 ` Xuewen Yan
2022-04-27 10:58 ` Qais Yousef
2022-05-01 3:20 ` Xuewen Yan
2022-05-03 14:43 ` Qais Yousef
2022-05-09 2:29 ` Xuewen Yan
2022-05-10 14:56 ` Qais Yousef
2022-05-10 17:44 ` Lukasz Luba
2022-05-10 18:44 ` Qais Yousef [this message]
2022-05-10 22:03 ` Lukasz Luba
2022-05-14 15:01 ` Xuewen Yan
2022-05-14 23:55 ` Qais Yousef
2022-05-15 0:53 ` [PATCH] sched/rt: Support multi-criterion fitness search for kernel test robot
2022-05-15 1:43 ` kernel test robot
2022-05-19 14:16 ` [sched] 0eee64011b: canonical_address#:#[##] kernel test robot
2022-06-15 10:13 ` [PATCH] sched: Take thermal pressure into account when determine rt fits capacity Qais Yousef
2022-06-15 11:17 ` Xuewen Yan
2022-06-15 13:54 ` Qais Yousef
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220510184436.fdgzzcfqqevinx5p@wubuntu \
--to=qais.yousef@arm.com \
--cc=di.shen@unisoc.com \
--cc=dietmar.eggemann@arm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lukasz.luba@arm.com \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rafael@kernel.org \
--cc=rostedt@goodmis.org \
--cc=vincent.guittot@linaro.org \
--cc=viresh.kumar@linaro.org \
--cc=xuewen.yan94@gmail.com \
--cc=xuewen.yan@unisoc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox