* [PATCH 0/4] consolidate and cleanup CPU capacity
@ 2023-09-01 13:03 Vincent Guittot
2023-09-01 13:03 ` [PATCH 1/4] sched: consolidate and cleanup access to CPU's max compute capacity Vincent Guittot
` (4 more replies)
0 siblings, 5 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-01 13:03 UTC (permalink / raw)
To: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm
Cc: conor.dooley, suagrfillet, ajones, lftan, Vincent Guittot
This is the 1st part of consolidating how the max compute capacity is
used in the scheduler and how we calculate the frequency for a level of
utilization.
Fix some unconsistancy when computing frequency for an utilization. There
can be a mismatch between energy model and schedutil.
Next step will be to make a difference between the original
max compute capacity of a CPU and what is currently available when
there is a capping applying forever (i.e. seconds or more).
Vincent Guittot (4):
sched: consolidate and cleanup access to CPU's max compute capacity
topology: add a new arch_scale_freq_reference
cpufreq/schedutil: use a fixed reference frequency
energy_model: use a fixed reference frequency
arch/arm/include/asm/topology.h | 1 +
arch/arm64/include/asm/topology.h | 1 +
arch/riscv/include/asm/topology.h | 1 +
drivers/base/arch_topology.c | 9 +++------
include/linux/arch_topology.h | 7 +++++++
include/linux/energy_model.h | 20 +++++++++++++++++---
kernel/sched/core.c | 2 +-
kernel/sched/cpudeadline.c | 2 +-
kernel/sched/cpufreq_schedutil.c | 29 +++++++++++++++++++++++++++--
kernel/sched/deadline.c | 4 ++--
kernel/sched/fair.c | 18 ++++++++----------
kernel/sched/rt.c | 2 +-
kernel/sched/sched.h | 6 ------
kernel/sched/topology.c | 7 +++++--
14 files changed, 75 insertions(+), 34 deletions(-)
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH 1/4] sched: consolidate and cleanup access to CPU's max compute capacity
2023-09-01 13:03 [PATCH 0/4] consolidate and cleanup CPU capacity Vincent Guittot
@ 2023-09-01 13:03 ` Vincent Guittot
2023-09-05 11:25 ` Peter Zijlstra
2023-09-14 20:45 ` Dietmar Eggemann
2023-09-01 13:03 ` [PATCH 2/4] topology: add a new arch_scale_freq_reference Vincent Guittot
` (3 subsequent siblings)
4 siblings, 2 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-01 13:03 UTC (permalink / raw)
To: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm
Cc: conor.dooley, suagrfillet, ajones, lftan, Vincent Guittot
Remove struct rq cpu_capacity_orig field and use arch_scale_cpu_capacity()
instead.
Scheduler uses 3 methods to get access to the CPU's max compute capacity:
- arch_scale_cpu_capacity(cpu) which is the default way to get CPU's capacity.
- cpu_capacity_orig field which is periodically updated with
arch_scale_cpu_capacity().
- capacity_orig_of(cpu) which encapsulates rq->cpu_capacity_orig
There is no real need to save the value returned by arch_scale_cpu_capacity()
in struct rq. arch_scale_cpu_capacity() returns:
- either a per_cpu variable.
- or a const value for systems which have only one capacity.
Remove cpu_capacity_orig and use arch_scale_cpu_capacity() everywhere.
No functional changes.
some tests of Arm64:
small SMP device (hikey): no noticeable changes
HMP device (RB5): hackbench shows minor improvement (1-2%)
large smp (thx2): hackbench and tbench shows minor improvement (1%)
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
kernel/sched/core.c | 2 +-
kernel/sched/cpudeadline.c | 2 +-
kernel/sched/deadline.c | 4 ++--
kernel/sched/fair.c | 18 ++++++++----------
kernel/sched/rt.c | 2 +-
kernel/sched/sched.h | 6 ------
kernel/sched/topology.c | 7 +++++--
7 files changed, 18 insertions(+), 23 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index efe3848978a0..6560392f2f83 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10014,7 +10014,7 @@ void __init sched_init(void)
#ifdef CONFIG_SMP
rq->sd = NULL;
rq->rd = NULL;
- rq->cpu_capacity = rq->cpu_capacity_orig = SCHED_CAPACITY_SCALE;
+ rq->cpu_capacity = SCHED_CAPACITY_SCALE;
rq->balance_callback = &balance_push_callback;
rq->active_balance = 0;
rq->next_balance = jiffies;
diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
index 57c92d751bcd..95baa12a1029 100644
--- a/kernel/sched/cpudeadline.c
+++ b/kernel/sched/cpudeadline.c
@@ -131,7 +131,7 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p,
if (!dl_task_fits_capacity(p, cpu)) {
cpumask_clear_cpu(cpu, later_mask);
- cap = capacity_orig_of(cpu);
+ cap = arch_scale_cpu_capacity(cpu);
if (cap > max_cap ||
(cpu == task_cpu(p) && cap == max_cap)) {
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 58b542bf2893..c57ef2e0db41 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -132,7 +132,7 @@ static inline unsigned long __dl_bw_capacity(const struct cpumask *mask)
int i;
for_each_cpu_and(i, mask, cpu_active_mask)
- cap += capacity_orig_of(i);
+ cap += arch_scale_cpu_capacity(i);
return cap;
}
@@ -144,7 +144,7 @@ static inline unsigned long __dl_bw_capacity(const struct cpumask *mask)
static inline unsigned long dl_bw_capacity(int i)
{
if (!sched_asym_cpucap_active() &&
- capacity_orig_of(i) == SCHED_CAPACITY_SCALE) {
+ arch_scale_cpu_capacity(i) == SCHED_CAPACITY_SCALE) {
return dl_bw_cpus(i) << SCHED_CAPACITY_SHIFT;
} else {
RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held(),
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0b7445cd5af9..06d6d0dde48a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4690,7 +4690,7 @@ static inline void util_est_update(struct cfs_rq *cfs_rq,
* To avoid overestimation of actual task utilization, skip updates if
* we cannot grant there is idle time in this CPU.
*/
- if (task_util(p) > capacity_orig_of(cpu_of(rq_of(cfs_rq))))
+ if (task_util(p) > arch_scale_cpu_capacity(cpu_of(rq_of(cfs_rq))))
return;
/*
@@ -4738,14 +4738,14 @@ static inline int util_fits_cpu(unsigned long util,
return fits;
/*
- * We must use capacity_orig_of() for comparing against uclamp_min and
+ * We must use arch_scale_cpu_capacity() for comparing against uclamp_min and
* uclamp_max. We only care about capacity pressure (by using
* capacity_of()) for comparing against the real util.
*
* If a task is boosted to 1024 for example, we don't want a tiny
* pressure to skew the check whether it fits a CPU or not.
*
- * Similarly if a task is capped to capacity_orig_of(little_cpu), it
+ * Similarly if a task is capped to arch_scale_cpu_capacity(little_cpu), it
* should fit a little cpu even if there's some pressure.
*
* Only exception is for thermal pressure since it has a direct impact
@@ -4757,7 +4757,7 @@ static inline int util_fits_cpu(unsigned long util,
* For uclamp_max, we can tolerate a drop in performance level as the
* goal is to cap the task. So it's okay if it's getting less.
*/
- capacity_orig = capacity_orig_of(cpu);
+ capacity_orig = arch_scale_cpu_capacity(cpu);
capacity_orig_thermal = capacity_orig - arch_scale_thermal_pressure(cpu);
/*
@@ -7226,7 +7226,7 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
* Look for the CPU with best capacity.
*/
else if (fits < 0)
- cpu_cap = capacity_orig_of(cpu) - thermal_load_avg(cpu_rq(cpu));
+ cpu_cap = arch_scale_cpu_capacity(cpu) - thermal_load_avg(cpu_rq(cpu));
/*
* First, select CPU which fits better (-1 being better than 0).
@@ -7468,7 +7468,7 @@ cpu_util(int cpu, struct task_struct *p, int dst_cpu, int boost)
util = max(util, util_est);
}
- return min(util, capacity_orig_of(cpu));
+ return min(util, arch_scale_cpu_capacity(cpu));
}
unsigned long cpu_util_cfs(int cpu)
@@ -9251,8 +9251,6 @@ static void update_cpu_capacity(struct sched_domain *sd, int cpu)
unsigned long capacity = scale_rt_capacity(cpu);
struct sched_group *sdg = sd->groups;
- cpu_rq(cpu)->cpu_capacity_orig = arch_scale_cpu_capacity(cpu);
-
if (!capacity)
capacity = 1;
@@ -9328,7 +9326,7 @@ static inline int
check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
{
return ((rq->cpu_capacity * sd->imbalance_pct) <
- (rq->cpu_capacity_orig * 100));
+ (arch_scale_cpu_capacity(cpu_of(rq)) * 100));
}
/*
@@ -9339,7 +9337,7 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
{
return rq->misfit_task_load &&
- (rq->cpu_capacity_orig < rq->rd->max_cpu_capacity ||
+ (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
check_cpu_capacity(rq, sd));
}
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0597ba0f85ff..8f4e8db6e234 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -515,7 +515,7 @@ static inline bool rt_task_fits_capacity(struct task_struct *p, int cpu)
min_cap = uclamp_eff_value(p, UCLAMP_MIN);
max_cap = uclamp_eff_value(p, UCLAMP_MAX);
- cpu_cap = capacity_orig_of(cpu);
+ cpu_cap = arch_scale_cpu_capacity(cpu);
return cpu_cap >= min(min_cap, max_cap);
}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3a01b7a2bf66..17ae151e90c0 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1048,7 +1048,6 @@ struct rq {
struct sched_domain __rcu *sd;
unsigned long cpu_capacity;
- unsigned long cpu_capacity_orig;
struct balance_callback *balance_callback;
@@ -2974,11 +2973,6 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
#endif
#ifdef CONFIG_SMP
-static inline unsigned long capacity_orig_of(int cpu)
-{
- return cpu_rq(cpu)->cpu_capacity_orig;
-}
-
/**
* enum cpu_util_type - CPU utilization type
* @FREQUENCY_UTIL: Utilization used to select frequency
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 05a5bc678c08..e6b0b6a8e60a 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2479,12 +2479,15 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
/* Attach the domains */
rcu_read_lock();
for_each_cpu(i, cpu_map) {
+ unsigned long capacity;
+
rq = cpu_rq(i);
sd = *per_cpu_ptr(d.sd, i);
+ capacity = arch_scale_cpu_capacity(i);
/* Use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing: */
- if (rq->cpu_capacity_orig > READ_ONCE(d.rd->max_cpu_capacity))
- WRITE_ONCE(d.rd->max_cpu_capacity, rq->cpu_capacity_orig);
+ if (capacity > READ_ONCE(d.rd->max_cpu_capacity))
+ WRITE_ONCE(d.rd->max_cpu_capacity, capacity);
cpu_attach_domain(sd, d.rd, i);
}
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 2/4] topology: add a new arch_scale_freq_reference
2023-09-01 13:03 [PATCH 0/4] consolidate and cleanup CPU capacity Vincent Guittot
2023-09-01 13:03 ` [PATCH 1/4] sched: consolidate and cleanup access to CPU's max compute capacity Vincent Guittot
@ 2023-09-01 13:03 ` Vincent Guittot
2023-09-04 12:35 ` Lukasz Luba
` (2 more replies)
2023-09-01 13:03 ` [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency Vincent Guittot
` (2 subsequent siblings)
4 siblings, 3 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-01 13:03 UTC (permalink / raw)
To: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm
Cc: conor.dooley, suagrfillet, ajones, lftan, Vincent Guittot
Create a new method to get a unique and fixed max frequency. Currently
cpuinfo.max_freq or last item of performance domain are used as the max
frequency when computing the frequency for a level of utilization but:
- cpuinfo_max_freq can change at runtime. boost is one example of
such change.
- cpuinfo.max_freq and last item of the PD can be different leading to
different results betwen cpufreq and energy model.
We need to save the max frequency that has been used when computing the
CPUs capacity and use this fixed and coherent value to convert between
frequency and CPU's capacity.
In fact, we already save the frequency that has been used when computing
the capacity of each CPU. We extend the precision to save khZ instead of
Mhz currently and we modify the type to be aligned with other variables
used when converting frequency to capacity and the other way.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
arch/arm/include/asm/topology.h | 1 +
arch/arm64/include/asm/topology.h | 1 +
arch/riscv/include/asm/topology.h | 1 +
drivers/base/arch_topology.c | 9 +++------
include/linux/arch_topology.h | 7 +++++++
5 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
index c7d2510e5a78..853c4f81ba4a 100644
--- a/arch/arm/include/asm/topology.h
+++ b/arch/arm/include/asm/topology.h
@@ -13,6 +13,7 @@
#define arch_set_freq_scale topology_set_freq_scale
#define arch_scale_freq_capacity topology_get_freq_scale
#define arch_scale_freq_invariant topology_scale_freq_invariant
+#define arch_scale_freq_ref topology_get_freq_ref
#endif
/* Replace task scheduler's default cpu-invariant accounting */
diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index 9fab663dd2de..a323b109b9c4 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -23,6 +23,7 @@ void update_freq_counters_refs(void);
#define arch_set_freq_scale topology_set_freq_scale
#define arch_scale_freq_capacity topology_get_freq_scale
#define arch_scale_freq_invariant topology_scale_freq_invariant
+#define arch_scale_freq_ref topology_get_freq_ref
#ifdef CONFIG_ACPI_CPPC_LIB
#define arch_init_invariance_cppc topology_init_cpu_capacity_cppc
diff --git a/arch/riscv/include/asm/topology.h b/arch/riscv/include/asm/topology.h
index e316ab3b77f3..61183688bdd5 100644
--- a/arch/riscv/include/asm/topology.h
+++ b/arch/riscv/include/asm/topology.h
@@ -9,6 +9,7 @@
#define arch_set_freq_scale topology_set_freq_scale
#define arch_scale_freq_capacity topology_get_freq_scale
#define arch_scale_freq_invariant topology_scale_freq_invariant
+#define arch_scale_freq_ref topology_get_freq_ref
/* Replace task scheduler's default cpu-invariant accounting */
#define arch_scale_cpu_capacity topology_get_cpu_scale
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index b741b5ba82bd..75fa67477a9d 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -26,7 +26,7 @@
static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data);
static struct cpumask scale_freq_counters_mask;
static bool scale_freq_invariant;
-static DEFINE_PER_CPU(u32, freq_factor) = 1;
+DEFINE_PER_CPU(unsigned long, freq_factor) = 1;
static bool supports_scale_freq_counters(const struct cpumask *cpus)
{
@@ -183,10 +183,7 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
cpu = cpumask_first(cpus);
max_capacity = arch_scale_cpu_capacity(cpu);
- max_freq = per_cpu(freq_factor, cpu);
-
- /* Convert to MHz scale which is used in 'freq_factor' */
- capped_freq /= 1000;
+ max_freq = arch_scale_freq_ref(cpu);
/*
* Handle properly the boost frequencies, which should simply clean
@@ -411,7 +408,7 @@ init_cpu_capacity_callback(struct notifier_block *nb,
cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus);
for_each_cpu(cpu, policy->related_cpus)
- per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq / 1000;
+ per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq;
if (cpumask_empty(cpus_to_visit)) {
topology_normalize_cpu_scale();
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index a07b510e7dc5..7a2dba9c3dc0 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -27,6 +27,13 @@ static inline unsigned long topology_get_cpu_scale(int cpu)
void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity);
+DECLARE_PER_CPU(unsigned long, freq_factor);
+
+static inline unsigned long topology_get_freq_ref(int cpu)
+{
+ return per_cpu(freq_factor, cpu);
+}
+
DECLARE_PER_CPU(unsigned long, arch_freq_scale);
static inline unsigned long topology_get_freq_scale(int cpu)
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency
2023-09-01 13:03 [PATCH 0/4] consolidate and cleanup CPU capacity Vincent Guittot
2023-09-01 13:03 ` [PATCH 1/4] sched: consolidate and cleanup access to CPU's max compute capacity Vincent Guittot
2023-09-01 13:03 ` [PATCH 2/4] topology: add a new arch_scale_freq_reference Vincent Guittot
@ 2023-09-01 13:03 ` Vincent Guittot
2023-09-02 10:57 ` Conor Dooley
` (2 more replies)
2023-09-01 13:03 ` [PATCH 4/4] energy_model: " Vincent Guittot
2023-09-21 10:12 ` [PATCH 0/4] consolidate and cleanup CPU capacity Ionela Voinescu
4 siblings, 3 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-01 13:03 UTC (permalink / raw)
To: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm
Cc: conor.dooley, suagrfillet, ajones, lftan, Vincent Guittot
cpuinfo_max_freq can change at runtime because of boost as example. This
implies that the value could not be the same than the one that has been
used when computing the capacity of a CPU.
The new arch_scale_freq_ref() returns a fixed and coherent frequency
reference that can be used when computing a frequency based on utilization.
Use this arch_scale_freq_ref() when available and fallback to
cpuinfo.max_freq otherwise.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
kernel/sched/cpufreq_schedutil.c | 29 +++++++++++++++++++++++++++--
1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 4492608b7d7f..9996ef429e2b 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -114,6 +114,31 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy)
}
}
+#ifdef arch_scale_freq_ref
+/**
+ * arch_scale_freq_ref_policy - get the reference frequency of a given CPU that
+ * has been used to correlate frequency and compute capacity.
+ * @cpu: the CPU in question.
+ *
+ * Return: the reference CPU frequency.
+ */
+static __always_inline
+unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
+{
+ return arch_scale_freq_ref(policy->cpu);
+}
+#else
+static __always_inline
+unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
+{
+ if (arch_scale_freq_invariant())
+ return policy->cpuinfo.max_freq;
+
+
+ return policy->cur;
+}
+#endif
+
/**
* get_next_freq - Compute a new frequency for a given cpufreq policy.
* @sg_policy: schedutil policy object to compute the new frequency for.
@@ -139,11 +164,11 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy)
static unsigned int get_next_freq(struct sugov_policy *sg_policy,
unsigned long util, unsigned long max)
{
+ unsigned int freq;
struct cpufreq_policy *policy = sg_policy->policy;
- unsigned int freq = arch_scale_freq_invariant() ?
- policy->cpuinfo.max_freq : policy->cur;
util = map_util_perf(util);
+ freq = arch_scale_freq_ref_policy(policy);
freq = map_util_freq(util, freq, max);
if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH 4/4] energy_model: use a fixed reference frequency
2023-09-01 13:03 [PATCH 0/4] consolidate and cleanup CPU capacity Vincent Guittot
` (2 preceding siblings ...)
2023-09-01 13:03 ` [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency Vincent Guittot
@ 2023-09-01 13:03 ` Vincent Guittot
2023-09-04 12:40 ` Lukasz Luba
` (3 more replies)
2023-09-21 10:12 ` [PATCH 0/4] consolidate and cleanup CPU capacity Ionela Voinescu
4 siblings, 4 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-01 13:03 UTC (permalink / raw)
To: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm
Cc: conor.dooley, suagrfillet, ajones, lftan, Vincent Guittot
The last item of a performance domain is not always the performance point
that has been used to compute CPU's capacity. This can lead to different
target frequency compared with other part of the system like schedutil and
would result in wrong energy estimation.
a new arch_scale_freq_ref() is available to return a fixed and coherent
frequency reference that can be used when computing the CPU's frequency
for an level of utilization. Use this function when available or fallback
to the last performance domain item otherwise.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
include/linux/energy_model.h | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index b9caa01dfac4..7ee07be6928e 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
return ps;
}
+#ifdef arch_scale_freq_ref
+static __always_inline
+unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
+{
+ return arch_scale_freq_ref(cpu);
+}
+#else
+static __always_inline
+unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
+{
+ return pd->table[pd->nr_perf_states - 1].frequency;
+}
+#endif
+
/**
* em_cpu_energy() - Estimates the energy consumed by the CPUs of a
* performance domain
@@ -224,7 +238,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
unsigned long max_util, unsigned long sum_util,
unsigned long allowed_cpu_cap)
{
- unsigned long freq, scale_cpu;
+ unsigned long freq, ref_freq, scale_cpu;
struct em_perf_state *ps;
int cpu;
@@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
*/
cpu = cpumask_first(to_cpumask(pd->cpus));
scale_cpu = arch_scale_cpu_capacity(cpu);
- ps = &pd->table[pd->nr_perf_states - 1];
+ ref_freq = arch_scale_freq_ref_em(cpu, pd);
max_util = map_util_perf(max_util);
max_util = min(max_util, allowed_cpu_cap);
- freq = map_util_freq(max_util, ps->frequency, scale_cpu);
+ freq = map_util_freq(max_util, ref_freq, scale_cpu);
/*
* Find the lowest performance state of the Energy Model above the
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency
2023-09-01 13:03 ` [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency Vincent Guittot
@ 2023-09-02 10:57 ` Conor Dooley
2023-09-02 12:49 ` Vincent Guittot
2023-09-05 11:24 ` Peter Zijlstra
2023-09-21 9:19 ` Ionela Voinescu
2 siblings, 1 reply; 29+ messages in thread
From: Conor Dooley @ 2023-09-02 10:57 UTC (permalink / raw)
To: Vincent Guittot
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
[-- Attachment #1.1: Type: text/plain, Size: 505 bytes --]
On Fri, Sep 01, 2023 at 03:03:11PM +0200, Vincent Guittot wrote:
> +#ifdef arch_scale_freq_ref
> +/**
> + * arch_scale_freq_ref_policy - get the reference frequency of a given CPU that
> + * has been used to correlate frequency and compute capacity.
> + * @cpu: the CPU in question.
Copy-pasting fail? That's not what your parameter is called.
> + *
> + * Return: the reference CPU frequency.
> + */
> +static __always_inline
> +unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
[-- Attachment #2: Type: text/plain, Size: 176 bytes --]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency
2023-09-02 10:57 ` Conor Dooley
@ 2023-09-02 12:49 ` Vincent Guittot
0 siblings, 0 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-02 12:49 UTC (permalink / raw)
To: Conor Dooley
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
On Sat, 2 Sept 2023 at 12:57, Conor Dooley <conor@kernel.org> wrote:
>
> On Fri, Sep 01, 2023 at 03:03:11PM +0200, Vincent Guittot wrote:
>
> > +#ifdef arch_scale_freq_ref
> > +/**
> > + * arch_scale_freq_ref_policy - get the reference frequency of a given CPU that
> > + * has been used to correlate frequency and compute capacity.
> > + * @cpu: the CPU in question.
>
> Copy-pasting fail? That's not what your parameter is called.
Yes, I forgot to change parameter and its description
>
> > + *
> > + * Return: the reference CPU frequency.
> > + */
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] topology: add a new arch_scale_freq_reference
2023-09-01 13:03 ` [PATCH 2/4] topology: add a new arch_scale_freq_reference Vincent Guittot
@ 2023-09-04 12:35 ` Lukasz Luba
2023-09-18 11:23 ` Vincent Guittot
2023-09-14 21:01 ` Dietmar Eggemann
2023-09-21 9:00 ` Ionela Voinescu
2 siblings, 1 reply; 29+ messages in thread
From: Lukasz Luba @ 2023-09-04 12:35 UTC (permalink / raw)
To: Vincent Guittot
Cc: conor.dooley, linux-arm-kernel, suagrfillet, juri.lelli,
sudeep.holla, palmer, linux-riscv, ajones, lftan, linux-kernel,
linux-pm, linux, bristot, catalin.marinas, bsegall, will,
vschneid, rostedt, rafael, dietmar.eggemann, aou, mingo,
paul.walmsley, mgorman, gregkh, peterz, viresh.kumar
Hi Vincent,
On 9/1/23 14:03, Vincent Guittot wrote:
> Create a new method to get a unique and fixed max frequency. Currently
> cpuinfo.max_freq or last item of performance domain are used as the max
> frequency when computing the frequency for a level of utilization but:
> - cpuinfo_max_freq can change at runtime. boost is one example of
> such change.
> - cpuinfo.max_freq and last item of the PD can be different leading to
> different results betwen cpufreq and energy model.
>
> We need to save the max frequency that has been used when computing the
> CPUs capacity and use this fixed and coherent value to convert between
> frequency and CPU's capacity.
>
> In fact, we already save the frequency that has been used when computing
> the capacity of each CPU. We extend the precision to save khZ instead of
> Mhz currently and we modify the type to be aligned with other variables
> used when converting frequency to capacity and the other way.
I do like this 'kHz' change. We also use kHz in the EM, so better
aligned now.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
> arch/arm/include/asm/topology.h | 1 +
> arch/arm64/include/asm/topology.h | 1 +
> arch/riscv/include/asm/topology.h | 1 +
> drivers/base/arch_topology.c | 9 +++------
> include/linux/arch_topology.h | 7 +++++++
> 5 files changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> index c7d2510e5a78..853c4f81ba4a 100644
> --- a/arch/arm/include/asm/topology.h
> +++ b/arch/arm/include/asm/topology.h
> @@ -13,6 +13,7 @@
> #define arch_set_freq_scale topology_set_freq_scale
> #define arch_scale_freq_capacity topology_get_freq_scale
> #define arch_scale_freq_invariant topology_scale_freq_invariant
> +#define arch_scale_freq_ref topology_get_freq_ref
> #endif
>
> /* Replace task scheduler's default cpu-invariant accounting */
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index 9fab663dd2de..a323b109b9c4 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -23,6 +23,7 @@ void update_freq_counters_refs(void);
> #define arch_set_freq_scale topology_set_freq_scale
> #define arch_scale_freq_capacity topology_get_freq_scale
> #define arch_scale_freq_invariant topology_scale_freq_invariant
> +#define arch_scale_freq_ref topology_get_freq_ref
>
> #ifdef CONFIG_ACPI_CPPC_LIB
> #define arch_init_invariance_cppc topology_init_cpu_capacity_cppc
> diff --git a/arch/riscv/include/asm/topology.h b/arch/riscv/include/asm/topology.h
> index e316ab3b77f3..61183688bdd5 100644
> --- a/arch/riscv/include/asm/topology.h
> +++ b/arch/riscv/include/asm/topology.h
> @@ -9,6 +9,7 @@
> #define arch_set_freq_scale topology_set_freq_scale
> #define arch_scale_freq_capacity topology_get_freq_scale
> #define arch_scale_freq_invariant topology_scale_freq_invariant
> +#define arch_scale_freq_ref topology_get_freq_ref
>
> /* Replace task scheduler's default cpu-invariant accounting */
> #define arch_scale_cpu_capacity topology_get_cpu_scale
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index b741b5ba82bd..75fa67477a9d 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -26,7 +26,7 @@
> static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data);
> static struct cpumask scale_freq_counters_mask;
> static bool scale_freq_invariant;
> -static DEFINE_PER_CPU(u32, freq_factor) = 1;
> +DEFINE_PER_CPU(unsigned long, freq_factor) = 1;
Why it's not static now?
>
> static bool supports_scale_freq_counters(const struct cpumask *cpus)
> {
> @@ -183,10 +183,7 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
>
> cpu = cpumask_first(cpus);
> max_capacity = arch_scale_cpu_capacity(cpu);
> - max_freq = per_cpu(freq_factor, cpu);
> -
> - /* Convert to MHz scale which is used in 'freq_factor' */
> - capped_freq /= 1000;
> + max_freq = arch_scale_freq_ref(cpu);
>
> /*
> * Handle properly the boost frequencies, which should simply clean
> @@ -411,7 +408,7 @@ init_cpu_capacity_callback(struct notifier_block *nb,
> cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus);
>
> for_each_cpu(cpu, policy->related_cpus)
> - per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq / 1000;
> + per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq;
>
> if (cpumask_empty(cpus_to_visit)) {
> topology_normalize_cpu_scale();
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index a07b510e7dc5..7a2dba9c3dc0 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -27,6 +27,13 @@ static inline unsigned long topology_get_cpu_scale(int cpu)
>
> void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity);
>
> +DECLARE_PER_CPU(unsigned long, freq_factor);
> +
> +static inline unsigned long topology_get_freq_ref(int cpu)
> +{
> + return per_cpu(freq_factor, cpu);
> +}
> +
> DECLARE_PER_CPU(unsigned long, arch_freq_scale);
>
> static inline unsigned long topology_get_freq_scale(int cpu)
Apart from that 'static' missing, that looks good.
Regards,
Lukasz
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/4] energy_model: use a fixed reference frequency
2023-09-01 13:03 ` [PATCH 4/4] energy_model: " Vincent Guittot
@ 2023-09-04 12:40 ` Lukasz Luba
2023-09-05 10:05 ` Pierre Gondois
` (2 subsequent siblings)
3 siblings, 0 replies; 29+ messages in thread
From: Lukasz Luba @ 2023-09-04 12:40 UTC (permalink / raw)
To: Vincent Guittot
Cc: conor.dooley, suagrfillet, ajones, catalin.marinas, lftan, will,
juri.lelli, peterz, mingo, sudeep.holla, rostedt, vschneid, linux,
linux-riscv, mgorman, bsegall, dietmar.eggemann, linux-pm,
bristot, rafael, gregkh, viresh.kumar, palmer, paul.walmsley,
linux-kernel, linux-arm-kernel, aou
On 9/1/23 14:03, Vincent Guittot wrote:
> The last item of a performance domain is not always the performance point
> that has been used to compute CPU's capacity. This can lead to different
> target frequency compared with other part of the system like schedutil and
> would result in wrong energy estimation.
>
> a new arch_scale_freq_ref() is available to return a fixed and coherent
> frequency reference that can be used when computing the CPU's frequency
> for an level of utilization. Use this function when available or fallback
> to the last performance domain item otherwise.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
> include/linux/energy_model.h | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index b9caa01dfac4..7ee07be6928e 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
> return ps;
> }
>
> +#ifdef arch_scale_freq_ref
> +static __always_inline
> +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> +{
> + return arch_scale_freq_ref(cpu);
> +}
> +#else
> +static __always_inline
> +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> +{
> + return pd->table[pd->nr_perf_states - 1].frequency;
> +}
> +#endif
> +
> /**
> * em_cpu_energy() - Estimates the energy consumed by the CPUs of a
> * performance domain
> @@ -224,7 +238,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> unsigned long max_util, unsigned long sum_util,
> unsigned long allowed_cpu_cap)
> {
> - unsigned long freq, scale_cpu;
> + unsigned long freq, ref_freq, scale_cpu;
> struct em_perf_state *ps;
> int cpu;
>
> @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> */
> cpu = cpumask_first(to_cpumask(pd->cpus));
> scale_cpu = arch_scale_cpu_capacity(cpu);
> - ps = &pd->table[pd->nr_perf_states - 1];
> + ref_freq = arch_scale_freq_ref_em(cpu, pd);
>
> max_util = map_util_perf(max_util);
> max_util = min(max_util, allowed_cpu_cap);
> - freq = map_util_freq(max_util, ps->frequency, scale_cpu);
> + freq = map_util_freq(max_util, ref_freq, scale_cpu);
>
> /*
> * Find the lowest performance state of the Energy Model above the
LGTM,
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
FYI, I'm going to send my v4 for the EM hopefully in next days, so those
changes might collide. But we can sort this out later (when both would
be ready).
Regards,
Lukasz
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/4] energy_model: use a fixed reference frequency
2023-09-01 13:03 ` [PATCH 4/4] energy_model: " Vincent Guittot
2023-09-04 12:40 ` Lukasz Luba
@ 2023-09-05 10:05 ` Pierre Gondois
2023-09-05 11:33 ` Peter Zijlstra
2023-09-05 13:16 ` Vincent Guittot
2023-09-14 21:07 ` Dietmar Eggemann
2023-09-21 10:12 ` Ionela Voinescu
3 siblings, 2 replies; 29+ messages in thread
From: Pierre Gondois @ 2023-09-05 10:05 UTC (permalink / raw)
To: Vincent Guittot, linux, catalin.marinas, will, paul.walmsley,
palmer, aou, sudeep.holla, gregkh, rafael, mingo, peterz,
juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman, bristot,
vschneid, viresh.kumar, linux-arm-kernel, linux-kernel,
linux-riscv, linux-pm
Cc: conor.dooley, suagrfillet, ajones, lftan
Hello Vincent,
I tried the patch-set on a platform using cppc_cpufreq and that has boosting
frequencies,
1-
On such platform, the CPU capacity comes from the CPPC highest_frequency
field. The CPU capacity is set to the capacity of the boosting frequency.
This behaviour is different from DT platforms where the CPU capacity is
updated whenever the boosting mode is enabled (it seems).
Wouldn't it be better to have CPU max capacities set to their boosting
capacity as for CPPC base platforms ? It seems the max frequency is always
available somehow for all the cpufreq drivers with boosting available, i.e.
acpi-cpufreq, amd-pstate, cppc_cpufreq.
2-
On the CPPC based platforms, the per_cpu freq_factor is not used/updated,
meaning that we have:
arch_scale_freq_ref_em()
\-arch_scale_freq_ref()
\-topology_get_freq_ref()
\-per_cpu(freq_factor, cpu) (set to the default value: 1)
and em_cpu_energy()'s ref_freq variable is then set to 1 instead of the max
frequency (leading to a 0 energy computation).
3-
Also just in case, arch_scale_freq_ref_policy() and cpufreq_get_hw_max_freq()
seem to have close (but not identical) purpose,
Regards,
Pierre
On 9/1/23 15:03, Vincent Guittot wrote:
> The last item of a performance domain is not always the performance point
> that has been used to compute CPU's capacity. This can lead to different
> target frequency compared with other part of the system like schedutil and
> would result in wrong energy estimation.
>
> a new arch_scale_freq_ref() is available to return a fixed and coherent
> frequency reference that can be used when computing the CPU's frequency
> for an level of utilization. Use this function when available or fallback
> to the last performance domain item otherwise.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
> include/linux/energy_model.h | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index b9caa01dfac4..7ee07be6928e 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
> return ps;
> }
>
> +#ifdef arch_scale_freq_ref
> +static __always_inline
> +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> +{
> + return arch_scale_freq_ref(cpu);
> +}
> +#else
> +static __always_inline
> +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> +{
> + return pd->table[pd->nr_perf_states - 1].frequency;
> +}
> +#endif
> +
> /**
> * em_cpu_energy() - Estimates the energy consumed by the CPUs of a
> * performance domain
> @@ -224,7 +238,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> unsigned long max_util, unsigned long sum_util,
> unsigned long allowed_cpu_cap)
> {
> - unsigned long freq, scale_cpu;
> + unsigned long freq, ref_freq, scale_cpu;
> struct em_perf_state *ps;
> int cpu;
>
> @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> */
> cpu = cpumask_first(to_cpumask(pd->cpus));
> scale_cpu = arch_scale_cpu_capacity(cpu);
> - ps = &pd->table[pd->nr_perf_states - 1];
> + ref_freq = arch_scale_freq_ref_em(cpu, pd);
>
> max_util = map_util_perf(max_util);
> max_util = min(max_util, allowed_cpu_cap);
> - freq = map_util_freq(max_util, ps->frequency, scale_cpu);
> + freq = map_util_freq(max_util, ref_freq, scale_cpu);
>
> /*
> * Find the lowest performance state of the Energy Model above the
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency
2023-09-01 13:03 ` [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency Vincent Guittot
2023-09-02 10:57 ` Conor Dooley
@ 2023-09-05 11:24 ` Peter Zijlstra
2023-09-05 13:50 ` Vincent Guittot
2023-09-21 9:19 ` Ionela Voinescu
2 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2023-09-05 11:24 UTC (permalink / raw)
To: Vincent Guittot
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, viresh.kumar,
linux-arm-kernel, linux-kernel, linux-riscv, linux-pm,
conor.dooley, suagrfillet, ajones, lftan
On Fri, Sep 01, 2023 at 03:03:11PM +0200, Vincent Guittot wrote:
> +#ifdef arch_scale_freq_ref
> +/**
> + * arch_scale_freq_ref_policy - get the reference frequency of a given CPU that
> + * has been used to correlate frequency and compute capacity.
> + * @cpu: the CPU in question.
> + *
> + * Return: the reference CPU frequency.
> + */
> +static __always_inline
> +unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
> +{
> + return arch_scale_freq_ref(policy->cpu);
> +}
> +#else
> +static __always_inline
> +unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
double space ^^
> +{
> + if (arch_scale_freq_invariant())
> + return policy->cpuinfo.max_freq;
> +
> +
superfluous whitespace there.
> + return policy->cur;
> +}
> +#endif
static __always_inline
unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
{
#ifdef arch_scale_freq_ref
return arch_scale_freq_ref(policy->cpu);
#endif
if (arch_scale_freq_invariant())
return policy->cpuinfo.max_freq;
return policy->cur;
}
Would have the lot in a single function and possibly easier to read?
> +
> /**
> * get_next_freq - Compute a new frequency for a given cpufreq policy.
> * @sg_policy: schedutil policy object to compute the new frequency for.
> @@ -139,11 +164,11 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy)
> static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> unsigned long util, unsigned long max)
> {
> + unsigned int freq;
> struct cpufreq_policy *policy = sg_policy->policy;
> - unsigned int freq = arch_scale_freq_invariant() ?
> - policy->cpuinfo.max_freq : policy->cur;
Leave the variable below per the inverse christmas thing.
>
> util = map_util_perf(util);
> + freq = arch_scale_freq_ref_policy(policy);
> freq = map_util_freq(util, freq, max);
>
> if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> --
> 2.34.1
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/4] sched: consolidate and cleanup access to CPU's max compute capacity
2023-09-01 13:03 ` [PATCH 1/4] sched: consolidate and cleanup access to CPU's max compute capacity Vincent Guittot
@ 2023-09-05 11:25 ` Peter Zijlstra
2023-09-14 20:45 ` Dietmar Eggemann
1 sibling, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2023-09-05 11:25 UTC (permalink / raw)
To: Vincent Guittot
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, viresh.kumar,
linux-arm-kernel, linux-kernel, linux-riscv, linux-pm,
conor.dooley, suagrfillet, ajones, lftan
On Fri, Sep 01, 2023 at 03:03:09PM +0200, Vincent Guittot wrote:
> Remove struct rq cpu_capacity_orig field and use arch_scale_cpu_capacity()
> instead.
>
> Scheduler uses 3 methods to get access to the CPU's max compute capacity:
> - arch_scale_cpu_capacity(cpu) which is the default way to get CPU's capacity.
> - cpu_capacity_orig field which is periodically updated with
> arch_scale_cpu_capacity().
> - capacity_orig_of(cpu) which encapsulates rq->cpu_capacity_orig
>
> There is no real need to save the value returned by arch_scale_cpu_capacity()
> in struct rq. arch_scale_cpu_capacity() returns:
> - either a per_cpu variable.
> - or a const value for systems which have only one capacity.
>
> Remove cpu_capacity_orig and use arch_scale_cpu_capacity() everywhere.
>
> No functional changes.
I think the original thinking was that we wouldn't know how expensive
the function call would end up being, but yeah, given how things stand
this is a nice cleanup.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/4] energy_model: use a fixed reference frequency
2023-09-05 10:05 ` Pierre Gondois
@ 2023-09-05 11:33 ` Peter Zijlstra
2023-09-05 13:16 ` Vincent Guittot
1 sibling, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2023-09-05 11:33 UTC (permalink / raw)
To: Pierre Gondois
Cc: Vincent Guittot, linux, catalin.marinas, will, paul.walmsley,
palmer, aou, sudeep.holla, gregkh, rafael, mingo, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
On Tue, Sep 05, 2023 at 12:05:30PM +0200, Pierre Gondois wrote:
> Hello Vincent,
> I tried the patch-set on a platform using cppc_cpufreq and that has boosting
> frequencies,
>
> 1-
> On such platform, the CPU capacity comes from the CPPC highest_frequency
> field. The CPU capacity is set to the capacity of the boosting frequency.
> This behaviour is different from DT platforms where the CPU capacity is
> updated whenever the boosting mode is enabled (it seems).
>
> Wouldn't it be better to have CPU max capacities set to their boosting
> capacity as for CPPC base platforms ? It seems the max frequency is always
> available somehow for all the cpufreq drivers with boosting available, i.e.
> acpi-cpufreq, amd-pstate, cppc_cpufreq.
So on Intel we don't use the max (turbo) boost value, but typically end
up picking the 4-core turbo value or something. There's a comment in
arch/x86/kernel/cpu/aperfmperf.c.
Per that comment it probably makes sense to be able to differentiate
between a mobile device and a server, or perhaps we can (ab)use the EAS
enable knob for this distinction?
That is, I'm not sure it makes sense to always pick the highest boost
freqency for ARM64 servers, very much analogous to how we don't do that
on Intel.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/4] energy_model: use a fixed reference frequency
2023-09-05 10:05 ` Pierre Gondois
2023-09-05 11:33 ` Peter Zijlstra
@ 2023-09-05 13:16 ` Vincent Guittot
1 sibling, 0 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-05 13:16 UTC (permalink / raw)
To: Pierre Gondois
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
On Tue, 5 Sept 2023 at 12:05, Pierre Gondois <pierre.gondois@arm.com> wrote:
>
> Hello Vincent,
> I tried the patch-set on a platform using cppc_cpufreq and that has boosting
> frequencies,
>
> 1-
> On such platform, the CPU capacity comes from the CPPC highest_frequency
> field. The CPU capacity is set to the capacity of the boosting frequency.
> This behaviour is different from DT platforms where the CPU capacity is
> updated whenever the boosting mode is enabled (it seems).
ok, I haven't noticed that cppc_cpufreq would be impacted by this
change in arch_topology. I'm going to check how to fix that
>
> Wouldn't it be better to have CPU max capacities set to their boosting
> capacity as for CPPC base platforms ? It seems the max frequency is always
> available somehow for all the cpufreq drivers with boosting available, i.e.
> acpi-cpufreq, amd-pstate, cppc_cpufreq.
Some platforms will never enable boost or boost is only temporarily
available before being capped. As a result some prefer to use a more
sustainable freq for their max capacity. That's why we can't always
use the max/boost freq
>
>
> 2-
> On the CPPC based platforms, the per_cpu freq_factor is not used/updated,
> meaning that we have:
> arch_scale_freq_ref_em()
> \-arch_scale_freq_ref()
> \-topology_get_freq_ref()
> \-per_cpu(freq_factor, cpu) (set to the default value: 1)
> and em_cpu_energy()'s ref_freq variable is then set to 1 instead of the max
> frequency (leading to a 0 energy computation).
IIUC, cppc uses the default cpu capacity of arch_topology and then
never updates it and it creates an EM for this SMP system.
ok, so you have an EM sets with ACPI and SMP.
I'm going to check where we could set this reference frequency for your case.
>
> 3-
> Also just in case, arch_scale_freq_ref_policy() and cpufreq_get_hw_max_freq()
> seem to have close (but not identical) purpose,
>
> Regards,
> Pierre
>
> On 9/1/23 15:03, Vincent Guittot wrote:
> > The last item of a performance domain is not always the performance point
> > that has been used to compute CPU's capacity. This can lead to different
> > target frequency compared with other part of the system like schedutil and
> > would result in wrong energy estimation.
> >
> > a new arch_scale_freq_ref() is available to return a fixed and coherent
> > frequency reference that can be used when computing the CPU's frequency
> > for an level of utilization. Use this function when available or fallback
> > to the last performance domain item otherwise.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> > include/linux/energy_model.h | 20 +++++++++++++++++---
> > 1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> > index b9caa01dfac4..7ee07be6928e 100644
> > --- a/include/linux/energy_model.h
> > +++ b/include/linux/energy_model.h
> > @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
> > return ps;
> > }
> >
> > +#ifdef arch_scale_freq_ref
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> > +{
> > + return arch_scale_freq_ref(cpu);
> > +}
> > +#else
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> > +{
> > + return pd->table[pd->nr_perf_states - 1].frequency;
> > +}
> > +#endif
> > +
> > /**
> > * em_cpu_energy() - Estimates the energy consumed by the CPUs of a
> > * performance domain
> > @@ -224,7 +238,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> > unsigned long max_util, unsigned long sum_util,
> > unsigned long allowed_cpu_cap)
> > {
> > - unsigned long freq, scale_cpu;
> > + unsigned long freq, ref_freq, scale_cpu;
> > struct em_perf_state *ps;
> > int cpu;
> >
> > @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> > */
> > cpu = cpumask_first(to_cpumask(pd->cpus));
> > scale_cpu = arch_scale_cpu_capacity(cpu);
> > - ps = &pd->table[pd->nr_perf_states - 1];
> > + ref_freq = arch_scale_freq_ref_em(cpu, pd);
> >
> > max_util = map_util_perf(max_util);
> > max_util = min(max_util, allowed_cpu_cap);
> > - freq = map_util_freq(max_util, ps->frequency, scale_cpu);
> > + freq = map_util_freq(max_util, ref_freq, scale_cpu);
> >
> > /*
> > * Find the lowest performance state of the Energy Model above the
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency
2023-09-05 11:24 ` Peter Zijlstra
@ 2023-09-05 13:50 ` Vincent Guittot
0 siblings, 0 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-05 13:50 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, viresh.kumar,
linux-arm-kernel, linux-kernel, linux-riscv, linux-pm,
conor.dooley, suagrfillet, ajones, lftan
On Tue, 5 Sept 2023 at 13:25, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Sep 01, 2023 at 03:03:11PM +0200, Vincent Guittot wrote:
>
> > +#ifdef arch_scale_freq_ref
> > +/**
> > + * arch_scale_freq_ref_policy - get the reference frequency of a given CPU that
> > + * has been used to correlate frequency and compute capacity.
> > + * @cpu: the CPU in question.
> > + *
> > + * Return: the reference CPU frequency.
> > + */
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
> > +{
> > + return arch_scale_freq_ref(policy->cpu);
> > +}
> > +#else
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
>
> double space ^^
I was expecting checkpatch.pl to catch it
will fix them
>
> > +{
> > + if (arch_scale_freq_invariant())
> > + return policy->cpuinfo.max_freq;
> > +
> > +
>
> superfluous whitespace there.
>
> > + return policy->cur;
> > +}
> > +#endif
>
> static __always_inline
> unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
> {
> #ifdef arch_scale_freq_ref
> return arch_scale_freq_ref(policy->cpu);
> #endif
>
> if (arch_scale_freq_invariant())
> return policy->cpuinfo.max_freq;
>
> return policy->cur;
> }
>
> Would have the lot in a single function and possibly easier to read?
yes
>
> > +
> > /**
> > * get_next_freq - Compute a new frequency for a given cpufreq policy.
> > * @sg_policy: schedutil policy object to compute the new frequency for.
> > @@ -139,11 +164,11 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy)
> > static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> > unsigned long util, unsigned long max)
> > {
> > + unsigned int freq;
> > struct cpufreq_policy *policy = sg_policy->policy;
> > - unsigned int freq = arch_scale_freq_invariant() ?
> > - policy->cpuinfo.max_freq : policy->cur;
>
> Leave the variable below per the inverse christmas thing.
>
> >
> > util = map_util_perf(util);
> > + freq = arch_scale_freq_ref_policy(policy);
> > freq = map_util_freq(util, freq, max);
> >
> > if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> > --
> > 2.34.1
> >
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/4] sched: consolidate and cleanup access to CPU's max compute capacity
2023-09-01 13:03 ` [PATCH 1/4] sched: consolidate and cleanup access to CPU's max compute capacity Vincent Guittot
2023-09-05 11:25 ` Peter Zijlstra
@ 2023-09-14 20:45 ` Dietmar Eggemann
2023-09-15 13:20 ` Vincent Guittot
1 sibling, 1 reply; 29+ messages in thread
From: Dietmar Eggemann @ 2023-09-14 20:45 UTC (permalink / raw)
To: Vincent Guittot, linux, catalin.marinas, will, paul.walmsley,
palmer, aou, sudeep.holla, gregkh, rafael, mingo, peterz,
juri.lelli, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm
Cc: conor.dooley, suagrfillet, ajones, lftan
On 01/09/2023 15:03, Vincent Guittot wrote:
> Remove struct rq cpu_capacity_orig field and use arch_scale_cpu_capacity()
> instead.
>
> Scheduler uses 3 methods to get access to the CPU's max compute capacity:
> - arch_scale_cpu_capacity(cpu) which is the default way to get CPU's capacity.
> - cpu_capacity_orig field which is periodically updated with
> arch_scale_cpu_capacity().
> - capacity_orig_of(cpu) which encapsulates rq->cpu_capacity_orig
>
> There is no real need to save the value returned by arch_scale_cpu_capacity()
> in struct rq. arch_scale_cpu_capacity() returns:
> - either a per_cpu variable.
> - or a const value for systems which have only one capacity.
>
> Remove cpu_capacity_orig and use arch_scale_cpu_capacity() everywhere.
>
> No functional changes.
>
> some tests of Arm64:
> small SMP device (hikey): no noticeable changes
> HMP device (RB5): hackbench shows minor improvement (1-2%)
> large smp (thx2): hackbench and tbench shows minor improvement (1%)
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Next to util_fits_cpu() which uses capacity_orig as a local variable
(which is fine) there is sis() referring to capacity_orig in a comment.
Documentation/scheduler/sched-capacity.rst uses the term `capacity_orig`
in chapter 1.2 to explain the difference between CPU's maximum
(attainable) capacity and capacity as the former reduced by pressure.
Not sure if you want to change those refs as well with this patch?
People might get confused about the term `capacity_orig` pretty soon.
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] topology: add a new arch_scale_freq_reference
2023-09-01 13:03 ` [PATCH 2/4] topology: add a new arch_scale_freq_reference Vincent Guittot
2023-09-04 12:35 ` Lukasz Luba
@ 2023-09-14 21:01 ` Dietmar Eggemann
2023-09-21 9:00 ` Ionela Voinescu
2 siblings, 0 replies; 29+ messages in thread
From: Dietmar Eggemann @ 2023-09-14 21:01 UTC (permalink / raw)
To: Vincent Guittot, linux, catalin.marinas, will, paul.walmsley,
palmer, aou, sudeep.holla, gregkh, rafael, mingo, peterz,
juri.lelli, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm
Cc: conor.dooley, suagrfillet, ajones, lftan
On 01/09/2023 15:03, Vincent Guittot wrote:
> Create a new method to get a unique and fixed max frequency. Currently
> cpuinfo.max_freq or last item of performance domain are used as the max
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IMHO better `highest (or last) performance state` (struct em_perf_state)
... maybe easier to grasp?
[...]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/4] energy_model: use a fixed reference frequency
2023-09-01 13:03 ` [PATCH 4/4] energy_model: " Vincent Guittot
2023-09-04 12:40 ` Lukasz Luba
2023-09-05 10:05 ` Pierre Gondois
@ 2023-09-14 21:07 ` Dietmar Eggemann
2023-09-15 13:35 ` Vincent Guittot
2023-09-21 10:12 ` Ionela Voinescu
3 siblings, 1 reply; 29+ messages in thread
From: Dietmar Eggemann @ 2023-09-14 21:07 UTC (permalink / raw)
To: Vincent Guittot, linux, catalin.marinas, will, paul.walmsley,
palmer, aou, sudeep.holla, gregkh, rafael, mingo, peterz,
juri.lelli, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm
Cc: conor.dooley, suagrfillet, ajones, lftan
On 01/09/2023 15:03, Vincent Guittot wrote:
[...]
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index b9caa01dfac4..7ee07be6928e 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
> return ps;
> }
>
> +#ifdef arch_scale_freq_ref
> +static __always_inline
> +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
Why is this function named with the arch prefix?
So far we have 5 arch functions (arch_scale_freq_tick() <->
arch_scale_freq_ref()) and e.g. Arm/Arm64 defines them with there
topology_foo implementations.
Isn't arch_scale_freq_ref_em() (as well as arch_scale_freq_ref_policy())
different in this sense and so a proper EM function which should
manifest in its name?
> +{
> + return arch_scale_freq_ref(cpu);
> +}
> +#else
> +static __always_inline
> +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> +{
> + return pd->table[pd->nr_perf_states - 1].frequency;
> +}
> +#endif
[...]
> @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> */
> cpu = cpumask_first(to_cpumask(pd->cpus));
> scale_cpu = arch_scale_cpu_capacity(cpu);
> - ps = &pd->table[pd->nr_perf_states - 1];
> + ref_freq = arch_scale_freq_ref_em(cpu, pd);
Why not using existing `unsigned long freq` here like in schedutil's
get_next_freq()?
>
> max_util = map_util_perf(max_util);
[...]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/4] sched: consolidate and cleanup access to CPU's max compute capacity
2023-09-14 20:45 ` Dietmar Eggemann
@ 2023-09-15 13:20 ` Vincent Guittot
0 siblings, 0 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-15 13:20 UTC (permalink / raw)
To: Dietmar Eggemann
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli, rostedt,
bsegall, mgorman, bristot, vschneid, viresh.kumar,
linux-arm-kernel, linux-kernel, linux-riscv, linux-pm,
conor.dooley, suagrfillet, ajones, lftan
On Thu, 14 Sept 2023 at 22:46, Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 01/09/2023 15:03, Vincent Guittot wrote:
> > Remove struct rq cpu_capacity_orig field and use arch_scale_cpu_capacity()
> > instead.
> >
> > Scheduler uses 3 methods to get access to the CPU's max compute capacity:
> > - arch_scale_cpu_capacity(cpu) which is the default way to get CPU's capacity.
> > - cpu_capacity_orig field which is periodically updated with
> > arch_scale_cpu_capacity().
> > - capacity_orig_of(cpu) which encapsulates rq->cpu_capacity_orig
> >
> > There is no real need to save the value returned by arch_scale_cpu_capacity()
> > in struct rq. arch_scale_cpu_capacity() returns:
> > - either a per_cpu variable.
> > - or a const value for systems which have only one capacity.
> >
> > Remove cpu_capacity_orig and use arch_scale_cpu_capacity() everywhere.
> >
> > No functional changes.
> >
> > some tests of Arm64:
> > small SMP device (hikey): no noticeable changes
> > HMP device (RB5): hackbench shows minor improvement (1-2%)
> > large smp (thx2): hackbench and tbench shows minor improvement (1%)
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>
> Next to util_fits_cpu() which uses capacity_orig as a local variable
> (which is fine) there is sis() referring to capacity_orig in a comment.
>
> Documentation/scheduler/sched-capacity.rst uses the term `capacity_orig`
> in chapter 1.2 to explain the difference between CPU's maximum
> (attainable) capacity and capacity as the former reduced by pressure.
ok, I will have a look at those references to capacity_orig
>
> Not sure if you want to change those refs as well with this patch?
> People might get confused about the term `capacity_orig` pretty soon.
>
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Thanks
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/4] energy_model: use a fixed reference frequency
2023-09-14 21:07 ` Dietmar Eggemann
@ 2023-09-15 13:35 ` Vincent Guittot
2023-09-18 20:46 ` Dietmar Eggemann
0 siblings, 1 reply; 29+ messages in thread
From: Vincent Guittot @ 2023-09-15 13:35 UTC (permalink / raw)
To: Dietmar Eggemann
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli, rostedt,
bsegall, mgorman, bristot, vschneid, viresh.kumar,
linux-arm-kernel, linux-kernel, linux-riscv, linux-pm,
conor.dooley, suagrfillet, ajones, lftan
On Thu, 14 Sept 2023 at 23:07, Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 01/09/2023 15:03, Vincent Guittot wrote:
>
> [...]
>
> > diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> > index b9caa01dfac4..7ee07be6928e 100644
> > --- a/include/linux/energy_model.h
> > +++ b/include/linux/energy_model.h
> > @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
> > return ps;
> > }
> >
> > +#ifdef arch_scale_freq_ref
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
>
> Why is this function named with the arch prefix?
>
> So far we have 5 arch functions (arch_scale_freq_tick() <->
> arch_scale_freq_ref()) and e.g. Arm/Arm64 defines them with there
> topology_foo implementations.
>
> Isn't arch_scale_freq_ref_em() (as well as arch_scale_freq_ref_policy())
> different in this sense and so a proper EM function which should
> manifest in its name?
arch_scale_freq_ref_em() is there to handle cases where
arch_scale_freq_ref() is not defined by arch. I keep arch_ prefix
because this should be provided by architecture which wants to use EM.
In the case of EM, it's only there for allyes|randconfig on arch that
doesn't use arch_topology.c like x86_64
>
> > +{
> > + return arch_scale_freq_ref(cpu);
> > +}
> > +#else
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> > +{
> > + return pd->table[pd->nr_perf_states - 1].frequency;
> > +}
> > +#endif
>
> [...]
>
> > @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> > */
> > cpu = cpumask_first(to_cpumask(pd->cpus));
> > scale_cpu = arch_scale_cpu_capacity(cpu);
> > - ps = &pd->table[pd->nr_perf_states - 1];
> > + ref_freq = arch_scale_freq_ref_em(cpu, pd);
>
> Why not using existing `unsigned long freq` here like in schedutil's
> get_next_freq()?
Find it easier to read and understand and will not make any difference
in the compiled code
>
> >
> > max_util = map_util_perf(max_util);
>
> [...]
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] topology: add a new arch_scale_freq_reference
2023-09-04 12:35 ` Lukasz Luba
@ 2023-09-18 11:23 ` Vincent Guittot
0 siblings, 0 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-18 11:23 UTC (permalink / raw)
To: Lukasz Luba
Cc: conor.dooley, linux-arm-kernel, suagrfillet, juri.lelli,
sudeep.holla, palmer, linux-riscv, ajones, lftan, linux-kernel,
linux-pm, linux, bristot, catalin.marinas, bsegall, will,
vschneid, rostedt, rafael, dietmar.eggemann, aou, mingo,
paul.walmsley, mgorman, gregkh, peterz, viresh.kumar
On Mon, 4 Sept 2023 at 14:34, Lukasz Luba <lukasz.luba@arm.com> wrote:
>
> Hi Vincent,
>
>
> On 9/1/23 14:03, Vincent Guittot wrote:
> > Create a new method to get a unique and fixed max frequency. Currently
> > cpuinfo.max_freq or last item of performance domain are used as the max
> > frequency when computing the frequency for a level of utilization but:
> > - cpuinfo_max_freq can change at runtime. boost is one example of
> > such change.
> > - cpuinfo.max_freq and last item of the PD can be different leading to
> > different results betwen cpufreq and energy model.
> >
> > We need to save the max frequency that has been used when computing the
> > CPUs capacity and use this fixed and coherent value to convert between
> > frequency and CPU's capacity.
> >
> > In fact, we already save the frequency that has been used when computing
> > the capacity of each CPU. We extend the precision to save khZ instead of
> > Mhz currently and we modify the type to be aligned with other variables
> > used when converting frequency to capacity and the other way.
>
> I do like this 'kHz' change. We also use kHz in the EM, so better
> aligned now.
>
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> > arch/arm/include/asm/topology.h | 1 +
> > arch/arm64/include/asm/topology.h | 1 +
> > arch/riscv/include/asm/topology.h | 1 +
> > drivers/base/arch_topology.c | 9 +++------
> > include/linux/arch_topology.h | 7 +++++++
> > 5 files changed, 13 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> > index c7d2510e5a78..853c4f81ba4a 100644
> > --- a/arch/arm/include/asm/topology.h
> > +++ b/arch/arm/include/asm/topology.h
> > @@ -13,6 +13,7 @@
> > #define arch_set_freq_scale topology_set_freq_scale
> > #define arch_scale_freq_capacity topology_get_freq_scale
> > #define arch_scale_freq_invariant topology_scale_freq_invariant
> > +#define arch_scale_freq_ref topology_get_freq_ref
> > #endif
> >
> > /* Replace task scheduler's default cpu-invariant accounting */
> > diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> > index 9fab663dd2de..a323b109b9c4 100644
> > --- a/arch/arm64/include/asm/topology.h
> > +++ b/arch/arm64/include/asm/topology.h
> > @@ -23,6 +23,7 @@ void update_freq_counters_refs(void);
> > #define arch_set_freq_scale topology_set_freq_scale
> > #define arch_scale_freq_capacity topology_get_freq_scale
> > #define arch_scale_freq_invariant topology_scale_freq_invariant
> > +#define arch_scale_freq_ref topology_get_freq_ref
> >
> > #ifdef CONFIG_ACPI_CPPC_LIB
> > #define arch_init_invariance_cppc topology_init_cpu_capacity_cppc
> > diff --git a/arch/riscv/include/asm/topology.h b/arch/riscv/include/asm/topology.h
> > index e316ab3b77f3..61183688bdd5 100644
> > --- a/arch/riscv/include/asm/topology.h
> > +++ b/arch/riscv/include/asm/topology.h
> > @@ -9,6 +9,7 @@
> > #define arch_set_freq_scale topology_set_freq_scale
> > #define arch_scale_freq_capacity topology_get_freq_scale
> > #define arch_scale_freq_invariant topology_scale_freq_invariant
> > +#define arch_scale_freq_ref topology_get_freq_ref
> >
> > /* Replace task scheduler's default cpu-invariant accounting */
> > #define arch_scale_cpu_capacity topology_get_cpu_scale
> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > index b741b5ba82bd..75fa67477a9d 100644
> > --- a/drivers/base/arch_topology.c
> > +++ b/drivers/base/arch_topology.c
> > @@ -26,7 +26,7 @@
> > static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data);
> > static struct cpumask scale_freq_counters_mask;
> > static bool scale_freq_invariant;
> > -static DEFINE_PER_CPU(u32, freq_factor) = 1;
> > +DEFINE_PER_CPU(unsigned long, freq_factor) = 1;
>
> Why it's not static now?
it can be accessed outside with inline function like cpu_scale and
arch_freq_scale
>
> >
> > static bool supports_scale_freq_counters(const struct cpumask *cpus)
> > {
> > @@ -183,10 +183,7 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
> >
> > cpu = cpumask_first(cpus);
> > max_capacity = arch_scale_cpu_capacity(cpu);
> > - max_freq = per_cpu(freq_factor, cpu);
> > -
> > - /* Convert to MHz scale which is used in 'freq_factor' */
> > - capped_freq /= 1000;
> > + max_freq = arch_scale_freq_ref(cpu);
> >
> > /*
> > * Handle properly the boost frequencies, which should simply clean
> > @@ -411,7 +408,7 @@ init_cpu_capacity_callback(struct notifier_block *nb,
> > cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus);
> >
> > for_each_cpu(cpu, policy->related_cpus)
> > - per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq / 1000;
> > + per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq;
> >
> > if (cpumask_empty(cpus_to_visit)) {
> > topology_normalize_cpu_scale();
> > diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> > index a07b510e7dc5..7a2dba9c3dc0 100644
> > --- a/include/linux/arch_topology.h
> > +++ b/include/linux/arch_topology.h
> > @@ -27,6 +27,13 @@ static inline unsigned long topology_get_cpu_scale(int cpu)
> >
> > void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity);
> >
> > +DECLARE_PER_CPU(unsigned long, freq_factor);
> > +
> > +static inline unsigned long topology_get_freq_ref(int cpu)
> > +{
> > + return per_cpu(freq_factor, cpu);
> > +}
> > +
> > DECLARE_PER_CPU(unsigned long, arch_freq_scale);
> >
> > static inline unsigned long topology_get_freq_scale(int cpu)
>
> Apart from that 'static' missing, that looks good.
>
> Regards,
> Lukasz
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/4] energy_model: use a fixed reference frequency
2023-09-15 13:35 ` Vincent Guittot
@ 2023-09-18 20:46 ` Dietmar Eggemann
0 siblings, 0 replies; 29+ messages in thread
From: Dietmar Eggemann @ 2023-09-18 20:46 UTC (permalink / raw)
To: Vincent Guittot
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli, rostedt,
bsegall, mgorman, bristot, vschneid, viresh.kumar,
linux-arm-kernel, linux-kernel, linux-riscv, linux-pm,
conor.dooley, suagrfillet, ajones, lftan
On 15/09/2023 15:35, Vincent Guittot wrote:
> On Thu, 14 Sept 2023 at 23:07, Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 01/09/2023 15:03, Vincent Guittot wrote:
[...]
>>> +#ifdef arch_scale_freq_ref
>>> +static __always_inline
>>> +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
>>
>> Why is this function named with the arch prefix?
>>
>> So far we have 5 arch functions (arch_scale_freq_tick() <->
>> arch_scale_freq_ref()) and e.g. Arm/Arm64 defines them with there
>> topology_foo implementations.
>>
>> Isn't arch_scale_freq_ref_em() (as well as arch_scale_freq_ref_policy())
>> different in this sense and so a proper EM function which should
>> manifest in its name?
>
> arch_scale_freq_ref_em() is there to handle cases where
> arch_scale_freq_ref() is not defined by arch. I keep arch_ prefix
> because this should be provided by architecture which wants to use EM.
That's correct, x86_64 with CONFIG_ENERGY_MODEL=y needs
arch_scale_freq_ref_em() returning highest perf_state of the perf_domain.
But this function as opposed to arch_scale_freq_ref() does not have to
be provided by the arch itself. It's provided by the EM instead.
That's why my doubt whether it should be named arch_scale_freq_ref_em().
> In the case of EM, it's only there for allyes|randconfig on arch that
> doesn't use arch_topology.c like x86_64
[...]
>>> @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
>>> */
>>> cpu = cpumask_first(to_cpumask(pd->cpus));
>>> scale_cpu = arch_scale_cpu_capacity(cpu);
>>> - ps = &pd->table[pd->nr_perf_states - 1];
>>> + ref_freq = arch_scale_freq_ref_em(cpu, pd);
>>
>> Why not using existing `unsigned long freq` here like in schedutil's
>> get_next_freq()?
>
> Find it easier to read and understand and will not make any difference
> in the compiled code
True but I thought it's easier to be able to detect the functional
similarity between em_cpu_energy() (*) and get_next_freq().
freq = arch_scale_freq_ref_{policy,em}({policy,(cpu, pd)});
... (in case of *)
freq = map_util_freq(util, freq, max);
Just a nitpick ...
[...]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] topology: add a new arch_scale_freq_reference
2023-09-01 13:03 ` [PATCH 2/4] topology: add a new arch_scale_freq_reference Vincent Guittot
2023-09-04 12:35 ` Lukasz Luba
2023-09-14 21:01 ` Dietmar Eggemann
@ 2023-09-21 9:00 ` Ionela Voinescu
2023-09-25 12:06 ` Vincent Guittot
2 siblings, 1 reply; 29+ messages in thread
From: Ionela Voinescu @ 2023-09-21 9:00 UTC (permalink / raw)
To: Vincent Guittot
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
Hi Vincent,
On Friday 01 Sep 2023 at 15:03:10 (+0200), Vincent Guittot wrote:
> Create a new method to get a unique and fixed max frequency. Currently
> cpuinfo.max_freq or last item of performance domain are used as the max
> frequency when computing the frequency for a level of utilization but:
> - cpuinfo_max_freq can change at runtime. boost is one example of
> such change.
> - cpuinfo.max_freq and last item of the PD can be different leading to
> different results betwen cpufreq and energy model.
>
> We need to save the max frequency that has been used when computing the
> CPUs capacity and use this fixed and coherent value to convert between
> frequency and CPU's capacity.
>
> In fact, we already save the frequency that has been used when computing
> the capacity of each CPU. We extend the precision to save khZ instead of
> Mhz currently and we modify the type to be aligned with other variables
> used when converting frequency to capacity and the other way.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
> arch/arm/include/asm/topology.h | 1 +
> arch/arm64/include/asm/topology.h | 1 +
> arch/riscv/include/asm/topology.h | 1 +
> drivers/base/arch_topology.c | 9 +++------
> include/linux/arch_topology.h | 7 +++++++
> 5 files changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> index c7d2510e5a78..853c4f81ba4a 100644
> --- a/arch/arm/include/asm/topology.h
> +++ b/arch/arm/include/asm/topology.h
> @@ -13,6 +13,7 @@
> #define arch_set_freq_scale topology_set_freq_scale
> #define arch_scale_freq_capacity topology_get_freq_scale
> #define arch_scale_freq_invariant topology_scale_freq_invariant
> +#define arch_scale_freq_ref topology_get_freq_ref
The "reference" frequency or performance of a CPU has a specific meaning
in the ACPI specification as a fixed frequency of a constant clock and its
associated performance level, usually used for performance to frequency
conversions. This is not guaranteed to be the maximum
performance/frequency and it's definitely not the case for arm systems
where this is tied to the system timer - 1-50Mhz.
So I believe it might create some confusion to call this "reference"
frequency.
Any reason for not calling it arch_scale_freq_max? I know not all max
frequencies are equal :) but it would still drive the point that this is
intended to act as a chosen maximum for the scheduler's use, especially
if there's a well place comment.
> #endif
>
> /* Replace task scheduler's default cpu-invariant accounting */
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index 9fab663dd2de..a323b109b9c4 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -23,6 +23,7 @@ void update_freq_counters_refs(void);
> #define arch_set_freq_scale topology_set_freq_scale
> #define arch_scale_freq_capacity topology_get_freq_scale
> #define arch_scale_freq_invariant topology_scale_freq_invariant
> +#define arch_scale_freq_ref topology_get_freq_ref
>
> #ifdef CONFIG_ACPI_CPPC_LIB
> #define arch_init_invariance_cppc topology_init_cpu_capacity_cppc
> diff --git a/arch/riscv/include/asm/topology.h b/arch/riscv/include/asm/topology.h
> index e316ab3b77f3..61183688bdd5 100644
> --- a/arch/riscv/include/asm/topology.h
> +++ b/arch/riscv/include/asm/topology.h
> @@ -9,6 +9,7 @@
> #define arch_set_freq_scale topology_set_freq_scale
> #define arch_scale_freq_capacity topology_get_freq_scale
> #define arch_scale_freq_invariant topology_scale_freq_invariant
> +#define arch_scale_freq_ref topology_get_freq_ref
>
> /* Replace task scheduler's default cpu-invariant accounting */
> #define arch_scale_cpu_capacity topology_get_cpu_scale
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index b741b5ba82bd..75fa67477a9d 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -26,7 +26,7 @@
> static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data);
> static struct cpumask scale_freq_counters_mask;
> static bool scale_freq_invariant;
> -static DEFINE_PER_CPU(u32, freq_factor) = 1;
> +DEFINE_PER_CPU(unsigned long, freq_factor) = 1;
Given its new wider use, it might be good for this to get a more
relevant name as well. Previously freq_factor made sense for its role
in using the dmips/mhz values to obtain CPU capacities. But if this is
now returned from arch_scale_freq_ref() it would be more difficult still
to easily understand what this value is supposed to reflect, when
reading the function name or the per-cpu variable name alone.
Thanks,
Ionela.
>
> static bool supports_scale_freq_counters(const struct cpumask *cpus)
> {
> @@ -183,10 +183,7 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
>
> cpu = cpumask_first(cpus);
> max_capacity = arch_scale_cpu_capacity(cpu);
> - max_freq = per_cpu(freq_factor, cpu);
> -
> - /* Convert to MHz scale which is used in 'freq_factor' */
> - capped_freq /= 1000;
> + max_freq = arch_scale_freq_ref(cpu);
>
> /*
> * Handle properly the boost frequencies, which should simply clean
> @@ -411,7 +408,7 @@ init_cpu_capacity_callback(struct notifier_block *nb,
> cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus);
>
> for_each_cpu(cpu, policy->related_cpus)
> - per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq / 1000;
> + per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq;
>
> if (cpumask_empty(cpus_to_visit)) {
> topology_normalize_cpu_scale();
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index a07b510e7dc5..7a2dba9c3dc0 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -27,6 +27,13 @@ static inline unsigned long topology_get_cpu_scale(int cpu)
>
> void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity);
>
> +DECLARE_PER_CPU(unsigned long, freq_factor);
> +
> +static inline unsigned long topology_get_freq_ref(int cpu)
> +{
> + return per_cpu(freq_factor, cpu);
> +}
> +
> DECLARE_PER_CPU(unsigned long, arch_freq_scale);
>
> static inline unsigned long topology_get_freq_scale(int cpu)
> --
> 2.34.1
>
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency
2023-09-01 13:03 ` [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency Vincent Guittot
2023-09-02 10:57 ` Conor Dooley
2023-09-05 11:24 ` Peter Zijlstra
@ 2023-09-21 9:19 ` Ionela Voinescu
2023-09-25 12:06 ` Vincent Guittot
2 siblings, 1 reply; 29+ messages in thread
From: Ionela Voinescu @ 2023-09-21 9:19 UTC (permalink / raw)
To: Vincent Guittot
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
On Friday 01 Sep 2023 at 15:03:11 (+0200), Vincent Guittot wrote:
> cpuinfo_max_freq can change at runtime because of boost as example. This
> implies that the value could not be the same than the one that has been
> used when computing the capacity of a CPU.
>
> The new arch_scale_freq_ref() returns a fixed and coherent frequency
> reference that can be used when computing a frequency based on utilization.
>
> Use this arch_scale_freq_ref() when available and fallback to
> cpuinfo.max_freq otherwise.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
> kernel/sched/cpufreq_schedutil.c | 29 +++++++++++++++++++++++++++--
> 1 file changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 4492608b7d7f..9996ef429e2b 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -114,6 +114,31 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy)
> }
> }
>
> +#ifdef arch_scale_freq_ref
> +/**
> + * arch_scale_freq_ref_policy - get the reference frequency of a given CPU that
> + * has been used to correlate frequency and compute capacity.
> + * @cpu: the CPU in question.
> + *
> + * Return: the reference CPU frequency.
> + */
> +static __always_inline
> +unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
This should not be an arch_ function as it's only a wrapper over an
arch_ function and not a function that different architectures might
implement differently usually in architecture specific code.
> +{
> + return arch_scale_freq_ref(policy->cpu);
It might make it easier to read if arch_scale_freq_ref() had a default
implementation that returned 0.
Then this code would simply become:
freq = arch_scale_freq_ref(policy->cpu);
if (freq)
return freq;
else if (arch_scale_freq_invariant())
return ..
..
This approach is similar to the use of arch_freq_get_on_cpu() in
cpufreq.c, and, as there, having a chosen maximum frequency of 0 would
not be a valid value.
> +}
> +#else
> +static __always_inline
> +unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
> +{
> + if (arch_scale_freq_invariant())
> + return policy->cpuinfo.max_freq;
> +
> +
> + return policy->cur;
> +}
> +#endif
> +
> /**
> * get_next_freq - Compute a new frequency for a given cpufreq policy.
> * @sg_policy: schedutil policy object to compute the new frequency for.
> @@ -139,11 +164,11 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy)
> static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> unsigned long util, unsigned long max)
> {
> + unsigned int freq;
> struct cpufreq_policy *policy = sg_policy->policy;
> - unsigned int freq = arch_scale_freq_invariant() ?
> - policy->cpuinfo.max_freq : policy->cur;
>
> util = map_util_perf(util);
> + freq = arch_scale_freq_ref_policy(policy);
Given its single use here, it would likely be better to place the code
above directly here, rather than create a wrapper over a few lines of
code.
Hope it helps,
Ionela.
> freq = map_util_freq(util, freq, max);
>
> if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> --
> 2.34.1
>
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] consolidate and cleanup CPU capacity
2023-09-01 13:03 [PATCH 0/4] consolidate and cleanup CPU capacity Vincent Guittot
` (3 preceding siblings ...)
2023-09-01 13:03 ` [PATCH 4/4] energy_model: " Vincent Guittot
@ 2023-09-21 10:12 ` Ionela Voinescu
2023-09-25 12:05 ` Vincent Guittot
4 siblings, 1 reply; 29+ messages in thread
From: Ionela Voinescu @ 2023-09-21 10:12 UTC (permalink / raw)
To: Vincent Guittot
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
On Friday 01 Sep 2023 at 15:03:08 (+0200), Vincent Guittot wrote:
> This is the 1st part of consolidating how the max compute capacity is
> used in the scheduler and how we calculate the frequency for a level of
> utilization.
>
> Fix some unconsistancy when computing frequency for an utilization. There
> can be a mismatch between energy model and schedutil.
There are a few more pieces of functionality that would be worth
consolidating in this set as well, if you'd like to consider them:
- arch_set_freq_scale() still uses policy->cpuinfo.max_freq. It might be
good to use the boot time stored max_freq here as well. Given that
arch_scale_cpu_capacity() would be based on that stored value, if
arch_scale_freq_capacity() ends up using a different value, it could
have interesting effects on the utilization signals in case of
boosting.
- As Pierre mentioned in a previous comment, there is already a
cpufreq_get_hw_max_freq() weak function that returns
policy->cpuinfo.max_freq and it's only used at boot time by
the setup code for AMU use for frequency invariance. I'm tempted to
suggest to use this to initialize what is now "freq_factor" as my
intention when I created that function was to provide platform
providers with the possibility to implement their own and decide on
the frequency they choose as their maximum. This could have been an
arch_ function as well, but as you mentioned before, mobile and server
platforms might want to choose different maximum values even if they
are using the same architecture.
Thanks,
Ionela.
>
> Next step will be to make a difference between the original
> max compute capacity of a CPU and what is currently available when
> there is a capping applying forever (i.e. seconds or more).
>
> Vincent Guittot (4):
> sched: consolidate and cleanup access to CPU's max compute capacity
> topology: add a new arch_scale_freq_reference
> cpufreq/schedutil: use a fixed reference frequency
> energy_model: use a fixed reference frequency
>
> arch/arm/include/asm/topology.h | 1 +
> arch/arm64/include/asm/topology.h | 1 +
> arch/riscv/include/asm/topology.h | 1 +
> drivers/base/arch_topology.c | 9 +++------
> include/linux/arch_topology.h | 7 +++++++
> include/linux/energy_model.h | 20 +++++++++++++++++---
> kernel/sched/core.c | 2 +-
> kernel/sched/cpudeadline.c | 2 +-
> kernel/sched/cpufreq_schedutil.c | 29 +++++++++++++++++++++++++++--
> kernel/sched/deadline.c | 4 ++--
> kernel/sched/fair.c | 18 ++++++++----------
> kernel/sched/rt.c | 2 +-
> kernel/sched/sched.h | 6 ------
> kernel/sched/topology.c | 7 +++++--
> 14 files changed, 75 insertions(+), 34 deletions(-)
>
> --
> 2.34.1
>
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 4/4] energy_model: use a fixed reference frequency
2023-09-01 13:03 ` [PATCH 4/4] energy_model: " Vincent Guittot
` (2 preceding siblings ...)
2023-09-14 21:07 ` Dietmar Eggemann
@ 2023-09-21 10:12 ` Ionela Voinescu
3 siblings, 0 replies; 29+ messages in thread
From: Ionela Voinescu @ 2023-09-21 10:12 UTC (permalink / raw)
To: Vincent Guittot
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
On Friday 01 Sep 2023 at 15:03:12 (+0200), Vincent Guittot wrote:
> The last item of a performance domain is not always the performance point
> that has been used to compute CPU's capacity. This can lead to different
> target frequency compared with other part of the system like schedutil and
> would result in wrong energy estimation.
>
> a new arch_scale_freq_ref() is available to return a fixed and coherent
> frequency reference that can be used when computing the CPU's frequency
> for an level of utilization. Use this function when available or fallback
> to the last performance domain item otherwise.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
> include/linux/energy_model.h | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index b9caa01dfac4..7ee07be6928e 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -204,6 +204,20 @@ struct em_perf_state *em_pd_get_efficient_state(struct em_perf_domain *pd,
> return ps;
> }
>
> +#ifdef arch_scale_freq_ref
> +static __always_inline
> +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
The comments in patch 3/4 should be considered for this function and its
use as well.
Thanks,
Ionela.
> +{
> + return arch_scale_freq_ref(cpu);
> +}
> +#else
> +static __always_inline
> +unsigned long arch_scale_freq_ref_em(int cpu, struct em_perf_domain *pd)
> +{
> + return pd->table[pd->nr_perf_states - 1].frequency;
> +}
> +#endif
> +
> /**
> * em_cpu_energy() - Estimates the energy consumed by the CPUs of a
> * performance domain
> @@ -224,7 +238,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> unsigned long max_util, unsigned long sum_util,
> unsigned long allowed_cpu_cap)
> {
> - unsigned long freq, scale_cpu;
> + unsigned long freq, ref_freq, scale_cpu;
> struct em_perf_state *ps;
> int cpu;
>
> @@ -241,11 +255,11 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
> */
> cpu = cpumask_first(to_cpumask(pd->cpus));
> scale_cpu = arch_scale_cpu_capacity(cpu);
> - ps = &pd->table[pd->nr_perf_states - 1];
> + ref_freq = arch_scale_freq_ref_em(cpu, pd);
>
> max_util = map_util_perf(max_util);
> max_util = min(max_util, allowed_cpu_cap);
> - freq = map_util_freq(max_util, ps->frequency, scale_cpu);
> + freq = map_util_freq(max_util, ref_freq, scale_cpu);
>
> /*
> * Find the lowest performance state of the Energy Model above the
> --
> 2.34.1
>
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 0/4] consolidate and cleanup CPU capacity
2023-09-21 10:12 ` [PATCH 0/4] consolidate and cleanup CPU capacity Ionela Voinescu
@ 2023-09-25 12:05 ` Vincent Guittot
0 siblings, 0 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-25 12:05 UTC (permalink / raw)
To: Ionela Voinescu
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
On Thu, 21 Sept 2023 at 12:12, Ionela Voinescu <ionela.voinescu@arm.com> wrote:
>
> On Friday 01 Sep 2023 at 15:03:08 (+0200), Vincent Guittot wrote:
> > This is the 1st part of consolidating how the max compute capacity is
> > used in the scheduler and how we calculate the frequency for a level of
> > utilization.
> >
> > Fix some unconsistancy when computing frequency for an utilization. There
> > can be a mismatch between energy model and schedutil.
>
> There are a few more pieces of functionality that would be worth
> consolidating in this set as well, if you'd like to consider them:
>
> - arch_set_freq_scale() still uses policy->cpuinfo.max_freq. It might be
> good to use the boot time stored max_freq here as well. Given that
> arch_scale_cpu_capacity() would be based on that stored value, if
> arch_scale_freq_capacity() ends up using a different value, it could
> have interesting effects on the utilization signals in case of
> boosting.
That's a good point, I will have a look at this part too. From a quick
look, this seems to be only used by architecture that are using
arch_topology.c too
>
> - As Pierre mentioned in a previous comment, there is already a
> cpufreq_get_hw_max_freq() weak function that returns
> policy->cpuinfo.max_freq and it's only used at boot time by
> the setup code for AMU use for frequency invariance. I'm tempted to
> suggest to use this to initialize what is now "freq_factor" as my
> intention when I created that function was to provide platform
> providers with the possibility to implement their own and decide on
> the frequency they choose as their maximum. This could have been an
> arch_ function as well, but as you mentioned before, mobile and server
> platforms might want to choose different maximum values even if they
> are using the same architecture.
>
> Thanks,
> Ionela.
>
> >
> > Next step will be to make a difference between the original
> > max compute capacity of a CPU and what is currently available when
> > there is a capping applying forever (i.e. seconds or more).
> >
> > Vincent Guittot (4):
> > sched: consolidate and cleanup access to CPU's max compute capacity
> > topology: add a new arch_scale_freq_reference
> > cpufreq/schedutil: use a fixed reference frequency
> > energy_model: use a fixed reference frequency
> >
> > arch/arm/include/asm/topology.h | 1 +
> > arch/arm64/include/asm/topology.h | 1 +
> > arch/riscv/include/asm/topology.h | 1 +
> > drivers/base/arch_topology.c | 9 +++------
> > include/linux/arch_topology.h | 7 +++++++
> > include/linux/energy_model.h | 20 +++++++++++++++++---
> > kernel/sched/core.c | 2 +-
> > kernel/sched/cpudeadline.c | 2 +-
> > kernel/sched/cpufreq_schedutil.c | 29 +++++++++++++++++++++++++++--
> > kernel/sched/deadline.c | 4 ++--
> > kernel/sched/fair.c | 18 ++++++++----------
> > kernel/sched/rt.c | 2 +-
> > kernel/sched/sched.h | 6 ------
> > kernel/sched/topology.c | 7 +++++--
> > 14 files changed, 75 insertions(+), 34 deletions(-)
> >
> > --
> > 2.34.1
> >
> >
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 2/4] topology: add a new arch_scale_freq_reference
2023-09-21 9:00 ` Ionela Voinescu
@ 2023-09-25 12:06 ` Vincent Guittot
0 siblings, 0 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-25 12:06 UTC (permalink / raw)
To: Ionela Voinescu
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
On Thu, 21 Sept 2023 at 11:00, Ionela Voinescu <ionela.voinescu@arm.com> wrote:
>
> Hi Vincent,
>
> On Friday 01 Sep 2023 at 15:03:10 (+0200), Vincent Guittot wrote:
> > Create a new method to get a unique and fixed max frequency. Currently
> > cpuinfo.max_freq or last item of performance domain are used as the max
> > frequency when computing the frequency for a level of utilization but:
> > - cpuinfo_max_freq can change at runtime. boost is one example of
> > such change.
> > - cpuinfo.max_freq and last item of the PD can be different leading to
> > different results betwen cpufreq and energy model.
> >
> > We need to save the max frequency that has been used when computing the
> > CPUs capacity and use this fixed and coherent value to convert between
> > frequency and CPU's capacity.
> >
> > In fact, we already save the frequency that has been used when computing
> > the capacity of each CPU. We extend the precision to save khZ instead of
> > Mhz currently and we modify the type to be aligned with other variables
> > used when converting frequency to capacity and the other way.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> > arch/arm/include/asm/topology.h | 1 +
> > arch/arm64/include/asm/topology.h | 1 +
> > arch/riscv/include/asm/topology.h | 1 +
> > drivers/base/arch_topology.c | 9 +++------
> > include/linux/arch_topology.h | 7 +++++++
> > 5 files changed, 13 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> > index c7d2510e5a78..853c4f81ba4a 100644
> > --- a/arch/arm/include/asm/topology.h
> > +++ b/arch/arm/include/asm/topology.h
> > @@ -13,6 +13,7 @@
> > #define arch_set_freq_scale topology_set_freq_scale
> > #define arch_scale_freq_capacity topology_get_freq_scale
> > #define arch_scale_freq_invariant topology_scale_freq_invariant
> > +#define arch_scale_freq_ref topology_get_freq_ref
>
> The "reference" frequency or performance of a CPU has a specific meaning
> in the ACPI specification as a fixed frequency of a constant clock and its
> associated performance level, usually used for performance to frequency
> conversions. This is not guaranteed to be the maximum
> performance/frequency and it's definitely not the case for arm systems
> where this is tied to the system timer - 1-50Mhz.
>
> So I believe it might create some confusion to call this "reference"
> frequency.
>
> Any reason for not calling it arch_scale_freq_max? I know not all max
> frequencies are equal :) but it would still drive the point that this is
> intended to act as a chosen maximum for the scheduler's use, especially
> if there's a well place comment.
I don't like max_freq because most will assume this is the cpufreq max
value whereas this value is the actual frequency that has been used as
a reference for computing the cpu capacity and should be used as the
reference for any further computation
>
> > #endif
> >
> > /* Replace task scheduler's default cpu-invariant accounting */
> > diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> > index 9fab663dd2de..a323b109b9c4 100644
> > --- a/arch/arm64/include/asm/topology.h
> > +++ b/arch/arm64/include/asm/topology.h
> > @@ -23,6 +23,7 @@ void update_freq_counters_refs(void);
> > #define arch_set_freq_scale topology_set_freq_scale
> > #define arch_scale_freq_capacity topology_get_freq_scale
> > #define arch_scale_freq_invariant topology_scale_freq_invariant
> > +#define arch_scale_freq_ref topology_get_freq_ref
> >
> > #ifdef CONFIG_ACPI_CPPC_LIB
> > #define arch_init_invariance_cppc topology_init_cpu_capacity_cppc
> > diff --git a/arch/riscv/include/asm/topology.h b/arch/riscv/include/asm/topology.h
> > index e316ab3b77f3..61183688bdd5 100644
> > --- a/arch/riscv/include/asm/topology.h
> > +++ b/arch/riscv/include/asm/topology.h
> > @@ -9,6 +9,7 @@
> > #define arch_set_freq_scale topology_set_freq_scale
> > #define arch_scale_freq_capacity topology_get_freq_scale
> > #define arch_scale_freq_invariant topology_scale_freq_invariant
> > +#define arch_scale_freq_ref topology_get_freq_ref
> >
> > /* Replace task scheduler's default cpu-invariant accounting */
> > #define arch_scale_cpu_capacity topology_get_cpu_scale
> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > index b741b5ba82bd..75fa67477a9d 100644
> > --- a/drivers/base/arch_topology.c
> > +++ b/drivers/base/arch_topology.c
> > @@ -26,7 +26,7 @@
> > static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data);
> > static struct cpumask scale_freq_counters_mask;
> > static bool scale_freq_invariant;
> > -static DEFINE_PER_CPU(u32, freq_factor) = 1;
> > +DEFINE_PER_CPU(unsigned long, freq_factor) = 1;
>
> Given its new wider use, it might be good for this to get a more
> relevant name as well. Previously freq_factor made sense for its role
> in using the dmips/mhz values to obtain CPU capacities. But if this is
> now returned from arch_scale_freq_ref() it would be more difficult still
> to easily understand what this value is supposed to reflect, when
> reading the function name or the per-cpu variable name alone.
As long as it was used internally behind a function i didn't see a
problem but I will probably have cppc_cpufreq to set it while
registering its Energy model so I will change it
>
> Thanks,
> Ionela.
>
> >
> > static bool supports_scale_freq_counters(const struct cpumask *cpus)
> > {
> > @@ -183,10 +183,7 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
> >
> > cpu = cpumask_first(cpus);
> > max_capacity = arch_scale_cpu_capacity(cpu);
> > - max_freq = per_cpu(freq_factor, cpu);
> > -
> > - /* Convert to MHz scale which is used in 'freq_factor' */
> > - capped_freq /= 1000;
> > + max_freq = arch_scale_freq_ref(cpu);
> >
> > /*
> > * Handle properly the boost frequencies, which should simply clean
> > @@ -411,7 +408,7 @@ init_cpu_capacity_callback(struct notifier_block *nb,
> > cpumask_andnot(cpus_to_visit, cpus_to_visit, policy->related_cpus);
> >
> > for_each_cpu(cpu, policy->related_cpus)
> > - per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq / 1000;
> > + per_cpu(freq_factor, cpu) = policy->cpuinfo.max_freq;
> >
> > if (cpumask_empty(cpus_to_visit)) {
> > topology_normalize_cpu_scale();
> > diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> > index a07b510e7dc5..7a2dba9c3dc0 100644
> > --- a/include/linux/arch_topology.h
> > +++ b/include/linux/arch_topology.h
> > @@ -27,6 +27,13 @@ static inline unsigned long topology_get_cpu_scale(int cpu)
> >
> > void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity);
> >
> > +DECLARE_PER_CPU(unsigned long, freq_factor);
> > +
> > +static inline unsigned long topology_get_freq_ref(int cpu)
> > +{
> > + return per_cpu(freq_factor, cpu);
> > +}
> > +
> > DECLARE_PER_CPU(unsigned long, arch_freq_scale);
> >
> > static inline unsigned long topology_get_freq_scale(int cpu)
> > --
> > 2.34.1
> >
> >
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency
2023-09-21 9:19 ` Ionela Voinescu
@ 2023-09-25 12:06 ` Vincent Guittot
0 siblings, 0 replies; 29+ messages in thread
From: Vincent Guittot @ 2023-09-25 12:06 UTC (permalink / raw)
To: Ionela Voinescu
Cc: linux, catalin.marinas, will, paul.walmsley, palmer, aou,
sudeep.holla, gregkh, rafael, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
viresh.kumar, linux-arm-kernel, linux-kernel, linux-riscv,
linux-pm, conor.dooley, suagrfillet, ajones, lftan
On Thu, 21 Sept 2023 at 11:19, Ionela Voinescu <ionela.voinescu@arm.com> wrote:
>
> On Friday 01 Sep 2023 at 15:03:11 (+0200), Vincent Guittot wrote:
> > cpuinfo_max_freq can change at runtime because of boost as example. This
> > implies that the value could not be the same than the one that has been
> > used when computing the capacity of a CPU.
> >
> > The new arch_scale_freq_ref() returns a fixed and coherent frequency
> > reference that can be used when computing a frequency based on utilization.
> >
> > Use this arch_scale_freq_ref() when available and fallback to
> > cpuinfo.max_freq otherwise.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> > kernel/sched/cpufreq_schedutil.c | 29 +++++++++++++++++++++++++++--
> > 1 file changed, 27 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 4492608b7d7f..9996ef429e2b 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -114,6 +114,31 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy)
> > }
> > }
> >
> > +#ifdef arch_scale_freq_ref
> > +/**
> > + * arch_scale_freq_ref_policy - get the reference frequency of a given CPU that
> > + * has been used to correlate frequency and compute capacity.
> > + * @cpu: the CPU in question.
> > + *
> > + * Return: the reference CPU frequency.
> > + */
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
>
> This should not be an arch_ function as it's only a wrapper over an
> arch_ function and not a function that different architectures might
> implement differently usually in architecture specific code.
I expect this function to disappear at some point once all arch will
use it that why I named it arch_* but I can rename it
>
> > +{
> > + return arch_scale_freq_ref(policy->cpu);
>
> It might make it easier to read if arch_scale_freq_ref() had a default
> implementation that returned 0.
I will use the suggestion made by Peter to have only one function
>
> Then this code would simply become:
>
> freq = arch_scale_freq_ref(policy->cpu);
> if (freq)
> return freq;
> else if (arch_scale_freq_invariant())
> return ..
> ..
>
> This approach is similar to the use of arch_freq_get_on_cpu() in
> cpufreq.c, and, as there, having a chosen maximum frequency of 0 would
> not be a valid value.
>
> > +}
> > +#else
> > +static __always_inline
> > +unsigned long arch_scale_freq_ref_policy(struct cpufreq_policy *policy)
> > +{
> > + if (arch_scale_freq_invariant())
> > + return policy->cpuinfo.max_freq;
> > +
> > +
> > + return policy->cur;
> > +}
> > +#endif
> > +
> > /**
> > * get_next_freq - Compute a new frequency for a given cpufreq policy.
> > * @sg_policy: schedutil policy object to compute the new frequency for.
> > @@ -139,11 +164,11 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy)
> > static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> > unsigned long util, unsigned long max)
> > {
> > + unsigned int freq;
> > struct cpufreq_policy *policy = sg_policy->policy;
> > - unsigned int freq = arch_scale_freq_invariant() ?
> > - policy->cpuinfo.max_freq : policy->cur;
> >
> > util = map_util_perf(util);
> > + freq = arch_scale_freq_ref_policy(policy);
>
> Given its single use here, it would likely be better to place the code
> above directly here, rather than create a wrapper over a few lines of
> code.
>
> Hope it helps,
> Ionela.
>
> > freq = map_util_freq(util, freq, max);
> >
> > if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> > --
> > 2.34.1
> >
> >
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2023-09-25 12:07 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-01 13:03 [PATCH 0/4] consolidate and cleanup CPU capacity Vincent Guittot
2023-09-01 13:03 ` [PATCH 1/4] sched: consolidate and cleanup access to CPU's max compute capacity Vincent Guittot
2023-09-05 11:25 ` Peter Zijlstra
2023-09-14 20:45 ` Dietmar Eggemann
2023-09-15 13:20 ` Vincent Guittot
2023-09-01 13:03 ` [PATCH 2/4] topology: add a new arch_scale_freq_reference Vincent Guittot
2023-09-04 12:35 ` Lukasz Luba
2023-09-18 11:23 ` Vincent Guittot
2023-09-14 21:01 ` Dietmar Eggemann
2023-09-21 9:00 ` Ionela Voinescu
2023-09-25 12:06 ` Vincent Guittot
2023-09-01 13:03 ` [PATCH 3/4] cpufreq/schedutil: use a fixed reference frequency Vincent Guittot
2023-09-02 10:57 ` Conor Dooley
2023-09-02 12:49 ` Vincent Guittot
2023-09-05 11:24 ` Peter Zijlstra
2023-09-05 13:50 ` Vincent Guittot
2023-09-21 9:19 ` Ionela Voinescu
2023-09-25 12:06 ` Vincent Guittot
2023-09-01 13:03 ` [PATCH 4/4] energy_model: " Vincent Guittot
2023-09-04 12:40 ` Lukasz Luba
2023-09-05 10:05 ` Pierre Gondois
2023-09-05 11:33 ` Peter Zijlstra
2023-09-05 13:16 ` Vincent Guittot
2023-09-14 21:07 ` Dietmar Eggemann
2023-09-15 13:35 ` Vincent Guittot
2023-09-18 20:46 ` Dietmar Eggemann
2023-09-21 10:12 ` Ionela Voinescu
2023-09-21 10:12 ` [PATCH 0/4] consolidate and cleanup CPU capacity Ionela Voinescu
2023-09-25 12:05 ` Vincent Guittot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).