From: Valentin Schneider <valentin.schneider@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
Ingo Molnar <mingo@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Vincent Guittot <vincent.guittot@linaro.org>,
Morten Rasmussen <morten.rasmussen@arm.com>,
dietmar.eggemann@arm.com, patrick.bellasi@matbug.net,
lenb@kernel.org, linux-kernel@vger.kernel.org,
ionela.voinescu@arm.com, qperret@google.com,
viresh.kumar@linaro.org
Subject: Re: [PATCH] sched: Add schedutil overview
Date: Fri, 18 Dec 2020 11:33:09 +0000
Message-ID: <jhjsg83s616.mognet@arm.com>
In-Reply-To: <20201218103258.GA3040@hirez.programming.kicks-ass.net>
Hi,
I have some more nits below.
On 18/12/20 10:32, Peter Zijlstra wrote:
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> Documentation/scheduler/schedutil.txt | 168 ++++++++++++++++++++++++++++++++++
> 1 file changed, 168 insertions(+)
>
> --- /dev/null
> +++ b/Documentation/scheduler/schedutil.txt
[...]
> +Frequency- / CPU Invariance
> +---------------------------
> +
> +Because consuming the CPU for 50% at 1GHz is not the same as consuming the CPU
> +for 50% at 2GHz, nor is running 50% on a LITTLE CPU the same as running 50% on
> +a big CPU, we allow architectures to scale the time delta with two ratios, one
> +Dynamic Voltage and Frequency Scaling (DVFS) ratio and one microarch ratio.
> +
> +For simple DVFS architectures (where software is in full control) we trivially
> +compute the ratio as:
> +
> +            f_cur
> +  r_dvfs := -----
> +            f_max
> +
> +For more dynamic systems where the hardware is in control of DVFS (Intel,
> +ARMv8.4-AMU) we use hardware counters to provide us this ratio. For Intel
Nit: To me this reads as if the presence of AMUs entails 'hardware is in
control of DVFS', which doesn't seem right. How about:

  For more dynamic systems where the hardware is in control of DVFS we use
  hardware counters (Intel APERF/MPERF, ARMv8.4-AMU) to provide us this
  ratio.

> +Schedutil / DVFS
> +----------------
> +
> +Every time the scheduler load tracking is updated (task wakeup, task
> +migration, time progression) we call out to schedutil to update the hardware
> +DVFS state.
> +
> +The basis is the CPU runqueue's 'running' metric, which per the above it is
> +the frequency invariant utilization estimate of the CPU. From this we compute
> +a desired frequency like:
> +
> +             max( running, util_est );  if UTIL_EST
> +  u_cfs := { running;                   otherwise
> +
> +  u_clamp := clamp( u_cfs, u_min, u_max )
> +
> +  u := u_cfs + u_rt + u_irq + u_dl;     [approx. see source for more detail]
> +
> +  f_des := min( f_max, 1.25 u * f_max )
> +
In schedutil_cpu_util(), uclamp clamps both u_cfs and u_rt. I'm afraid the
below might just bring more confusion; what do you think?

               clamp( u_cfs + u_rt, u_min, u_max );  if UCLAMP_TASK
  u_clamp := { u_cfs + u_rt;                         otherwise

  u := u_clamp + u_irq + u_dl;  [approx. see source for more detail]

(also, does this need a word about runnable rt tasks => goto max?)

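To make the suggested formulas concrete, here is a rough C sketch of the
arithmetic only. It is not the code in kernel/sched/cpufreq_schedutil.c: the
names sketch_util(), sketch_freq() and UTIL_SCALE are made up for
illustration, utilization is assumed to be a fixed-point value where 1024
means 100%, and the "runnable rt tasks => goto max" case is left out.

```c
#include <assert.h>

/* Assumption for this sketch: utilization of 1024 == 100% of the CPU. */
#define UTIL_SCALE 1024UL

static unsigned long clamp_ul(unsigned long v, unsigned long lo,
			      unsigned long hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * u_clamp := clamp(u_cfs + u_rt, u_min, u_max) if clamping is enabled,
 *            u_cfs + u_rt otherwise
 * u       := u_clamp + u_irq + u_dl
 */
static unsigned long sketch_util(unsigned long u_cfs, unsigned long u_rt,
				 unsigned long u_irq, unsigned long u_dl,
				 unsigned long u_min, unsigned long u_max,
				 int uclamp_enabled)
{
	unsigned long u = u_cfs + u_rt;

	if (uclamp_enabled)
		u = clamp_ul(u, u_min, u_max);

	return u + u_irq + u_dl;
}

/* f_des := min(f_max, 1.25 * u * f_max), with u expressed out of UTIL_SCALE */
static unsigned long sketch_freq(unsigned long u, unsigned long f_max_khz)
{
	unsigned long long f;

	assert(f_max_khz > 0);

	f = (unsigned long long)f_max_khz * u / UTIL_SCALE;
	f += f / 4;	/* 25% headroom, i.e. the 1.25 factor */

	return f < f_max_khz ? (unsigned long)f : f_max_khz;
}
```

With these definitions, a utilization of 512 (50%) on a CPU with a 2 GHz f_max
asks for 1.25 GHz, and anything at or above 80% utilization saturates at
f_max.
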
> +XXX IO-wait; when the update is due to a task wakeup from IO-completion we
> +boost 'u' above.
> +
> +This frequency is then used to select a P-state/OPP or directly munged into a
> +CPPC style request to the hardware.
> +
> +XXX: deadline tasks (Sporadic Task Model) allows us to calculate a hard f_min
> +required to satisfy the workload.
> +
> +Because these callbacks are directly from the scheduler, the DVFS hardware
> +interaction should be 'fast' and non-blocking. Schedutil supports
> +rate-limiting DVFS requests for when hardware interaction is slow and
> +expensive, this reduces effectiveness.
> +
> +For more information see: kernel/sched/cpufreq_schedutil.c
> +
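The rate-limiting mentioned in the quoted paragraph can be pictured with a
small C sketch. This is only an illustration under assumed names (struct
sketch_policy, sketch_should_update() and the nanosecond fields are
hypothetical) and does not mirror the actual schedutil data structures: a
request is dropped unless at least rate_limit_ns has elapsed since the last
one that was let through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-policy state for this sketch. */
struct sketch_policy {
	uint64_t last_update_ns;	/* time of the last accepted request */
	uint64_t rate_limit_ns;		/* minimum spacing between requests */
};

/*
 * Return true if a DVFS request may be issued at now_ns, and record it.
 * Requests arriving sooner than rate_limit_ns after the previous accepted
 * one are rejected, which is where the loss of effectiveness comes from.
 */
static bool sketch_should_update(struct sketch_policy *p, uint64_t now_ns)
{
	assert(p);

	if (now_ns - p->last_update_ns < p->rate_limit_ns)
		return false;

	p->last_update_ns = now_ns;
	return true;
}
```
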
Thread overview: 5+ messages
2020-12-18 10:32 [PATCH] sched: Add schedutil overview Peter Zijlstra
2020-12-18 11:33 ` Valentin Schneider [this message]
2020-12-18 13:40 ` Morten Rasmussen
2020-12-18 14:25 ` Valentin Schneider
2021-01-14 11:29 ` [tip: sched/core] " tip-bot2 for Peter Zijlstra