* [PATCH v8 0/4] Add support for AArch64 AMUv1-based average freq
@ 2024-12-06 13:55 Beata Michalska
2024-12-06 13:55 ` [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry Beata Michalska
0 siblings, 1 reply; 13+ messages in thread
From: Beata Michalska @ 2024-12-06 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, linux-pm, ionela.voinescu,
sudeep.holla, will, catalin.marinas, rafael, viresh.kumar
Cc: sumitg, yang, vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc
Hi All,
This series adds support for obtaining an average CPU frequency based on
a hardware provided feedback. The average frequency is being exposed via
dedicated yet optional cpufreq sysfs attribute - cpuinfo_avg_freq.
The architecture specific bits are being provided for AArch64, caching on
existing implementation for FIE and AMUv1 support: the frequency scale
factor, updated on each sched tick, serving as a base for retrieving
the frequency for a given CPU, representing an average frequency
reported between the ticks.
The changes have been rather lightly (due to some limitations) tested on
an FVP model.
Note that [PATCH 2/4] arm64: amu: Delay allocating cpumask for AMU FIE support
can be merged independently.
Additionally, this series depends on [6]
Relevant discussions:
[1] https://lore.kernel.org/all/20240229162520.970986-1-vanshikonda@os.amperecomputing.com/
[2] https://lore.kernel.org/all/7eozim2xnepacnnkzxlbx34hib4otycnbn4dqymfziqou5lw5u@5xzpv3t7sxo3/
[3] https://lore.kernel.org/all/20231212072617.14756-1-lihuisong@huawei.com/
[4] https://lore.kernel.org/lkml/ZIHpd6unkOtYVEqP@e120325.cambridge.arm.com/T/#m4e74cb5a0aaa353c60fedc6cfb95ab7a6e381e3c
[5] https://lore.kernel.org/all/20240603081331.3829278-1-beata.michalska@arm.com/
[6] https://lore.kernel.org/all/20240827154818.1195849-1-ionela.voinescu@arm.com/
v8:
- Drop introducing new function and reuse arch_freq_get_on_cpu, guarding its use
in scaling_cur_freq sysfs handler with dedicated config for x86
v7:
- Dropping 'arch_topology: init capacity_freq_ref to 0' patch from the series
as this one has been sent separately as an independent change
[https://lore.kernel.org/all/20240827154818.1195849-1-ionela.voinescu@arm.com/]
- Including in the series change that introduces new sysfs entry [PATCH 1/4]
- Consequently modifying previously arch_freq_get_on_cpu to match reqs for new
sysfs attribute
- Dropping an idea of considering a CPU that has been idle for a while as a
valid source of information for obtaining an AMU-counter based frequency
- Some minor cosmetic changes
v6:
- delay allocating cpumask for AMU FIE support instead of invalidating the mask
upon failure to register cpufreq policy notifications
- drop the change to cpufreq core (for cpuinfo_cur_freq) as this one will be
sent as a separate change
v5:
- Fix invalid access to cpumask
- Reworked finding reference cpu when getting the freq
v4:
- dropping seqcount
- fixing identifying active cpu within given policy
- skipping full dynticks cpus when retrieving the freq
- bringing back plugging in arch_freq_get_on_cpu into cpuinfo_cur_freq
v3:
- dropping changes to cpufreq_verify_current_freq
- pulling in changes from Ionela initializing capacity_freq_ref to 0
(thanks for that!) and applying suggestions made by her during last review:
- switching to arch_scale_freq_capacity and arch_scale_freq_ref when
reversing freq scale factor computation
- swapping shift with multiplication
- adding time limit for considering last scale update as valid
- updating frequency scale factor upon entering idle
v2:
- Splitting the patches
- Adding comment for full dyntick mode
- Plugging arch_freq_get_on_cpu into cpufreq_verify_current_freq instead
of in show_cpuinfo_cur_freq to allow the framework to stay more in sync
with potential freq changes
CC: Jonathan Corbet <corbet@lwn.net>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: Dave Hansen <dave.hansen@linux.intel.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Phil Auld <pauld@redhat.com>
CC: x86@kernel.org
CC: linux-doc@vger.kernel.org
Beata Michalska (4):
cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
arm64: amu: Delay allocating cpumask for AMU FIE support
arm64: Provide an AMU-based version of arch_freq_get_on_cpu
arm64: Update AMU-based freq scale factor on entering idle
Documentation/admin-guide/pm/cpufreq.rst | 16 ++-
arch/arm64/kernel/topology.c | 144 +++++++++++++++++++----
arch/x86/kernel/cpu/aperfmperf.c | 2 +-
arch/x86/kernel/cpu/proc.c | 7 +-
drivers/cpufreq/Kconfig.x86 | 12 ++
drivers/cpufreq/cpufreq.c | 36 +++++-
include/linux/cpufreq.h | 2 +-
7 files changed, 188 insertions(+), 31 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-06 13:55 [PATCH v8 0/4] Add support for AArch64 AMUv1-based average freq Beata Michalska
@ 2024-12-06 13:55 ` Beata Michalska
2024-12-12 6:51 ` Viresh Kumar
2024-12-16 5:43 ` Kai-Heng Feng
0 siblings, 2 replies; 13+ messages in thread
From: Beata Michalska @ 2024-12-06 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel, linux-pm, ionela.voinescu,
sudeep.holla, will, catalin.marinas, rafael, viresh.kumar
Cc: sumitg, yang, vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc
Currently the CPUFreq core exposes two sysfs attributes that can be used
to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
and scaling_cur_freq. Both provide slightly different view on the
subject and they do come with their own drawbacks.
cpuinfo_cur_freq provides higher precision though at a cost of being
rather expensive. Moreover, the information retrieved via this attribute
is somewhat short lived as frequency can change at any point of time
making it difficult to reason from.
scaling_cur_freq, on the other hand, tends to be less accurate but then
the actual level of precision (and source of information) varies between
architectures making it a bit ambiguous.
The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
distinct interface, exposing an average frequency of a given CPU(s), as
reported by the hardware, over a time frame spanning no more than a few
milliseconds. As it requires appropriate hardware support, this
interface is optional.
Note that under the hood, the new attribute relies on the information
provided by arch_freq_get_on_cpu, which, up to this point, has been
feeding data for scaling_cur_freq attribute, being the source of
ambiguity when it comes to interpretation. This has been amended by
restoring the intended behavior for scaling_cur_freq, with a new
dedicated config option to maintain status quo for those, who may need
it.
CC: Jonathan Corbet <corbet@lwn.net>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: Dave Hansen <dave.hansen@linux.intel.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Phil Auld <pauld@redhat.com>
CC: x86@kernel.org
CC: linux-doc@vger.kernel.org
Signed-off-by: Beata Michalska <beata.michalska@arm.com>
---
Documentation/admin-guide/pm/cpufreq.rst | 16 ++++++++++-
arch/x86/kernel/cpu/aperfmperf.c | 2 +-
arch/x86/kernel/cpu/proc.c | 7 +++--
drivers/cpufreq/Kconfig.x86 | 12 ++++++++
drivers/cpufreq/cpufreq.c | 36 +++++++++++++++++++++---
include/linux/cpufreq.h | 2 +-
6 files changed, 66 insertions(+), 9 deletions(-)
diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
index fe1be4ad88cb..76f3835afe01 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -248,6 +248,19 @@ are the following:
If that frequency cannot be determined, this attribute should not
be present.
+``cpuinfo_avg_freq``
+ An average frequency (in KHz) of all CPUs belonging to a given policy,
+ derived from a hardware provided feedback and reported on a time frame
+ spanning at most few milliseconds.
+
+ This is expected to be based on the frequency the hardware actually runs
+ at and, as such, might require specialised hardware support (such as AMU
+ extension on ARM). If one cannot be determined, this attribute should
+ not be present.
+
+ Note, that failed attempt to retrieve current frequency for a given
+ CPU(s) will result in an appropriate error.
+
``cpuinfo_max_freq``
Maximum possible operating frequency the CPUs belonging to this policy
can run at (in kHz).
@@ -293,7 +306,8 @@ are the following:
Some architectures (e.g. ``x86``) may attempt to provide information
more precisely reflecting the current CPU frequency through this
attribute, but that still may not be the exact current CPU frequency as
- seen by the hardware at the moment.
+ seen by the hardware at the moment. This behavior though, is only
+ available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option.
``scaling_driver``
The scaling driver currently in use.
diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
index 0b69bfbf345d..a00059139ca4 100644
--- a/arch/x86/kernel/cpu/aperfmperf.c
+++ b/arch/x86/kernel/cpu/aperfmperf.c
@@ -413,7 +413,7 @@ void arch_scale_freq_tick(void)
*/
#define MAX_SAMPLE_AGE ((unsigned long)HZ / 50)
-unsigned int arch_freq_get_on_cpu(int cpu)
+int arch_freq_get_on_cpu(int cpu)
{
struct aperfmperf *s = per_cpu_ptr(&cpu_samples, cpu);
unsigned int seq, freq;
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index e65fae63660e..34d8fb93fb70 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -86,9 +86,12 @@ static int show_cpuinfo(struct seq_file *m, void *v)
seq_printf(m, "microcode\t: 0x%x\n", c->microcode);
if (cpu_has(c, X86_FEATURE_TSC)) {
- unsigned int freq = arch_freq_get_on_cpu(cpu);
+ int freq = arch_freq_get_on_cpu(cpu);
- seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000));
+ if (freq <= 0)
+ seq_puts(m, "cpu MHz\t\t: Unknown\n");
+ else
+ seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000));
}
/* Cache size */
diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
index 97c2d4f15d76..212e1b9afe21 100644
--- a/drivers/cpufreq/Kconfig.x86
+++ b/drivers/cpufreq/Kconfig.x86
@@ -340,3 +340,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK
option lets the probing code bypass some of those checks if the
parameter "relaxed_check=1" is passed to the module.
+config CPUFREQ_ARCH_CUR_FREQ
+ default y
+ bool "Current frequency derived from HW provided feedback"
+ help
+ This determines whether the scaling_cur_freq sysfs attribute returns
+ the last requested frequency or a more precise value based on hardware
+ provided feedback (as architected counters).
+ Given that a more precise frequency can now be provided via the
+ cpuinfo_avg_cur_freq attribute, by enabling this option,
+ scaling_cur_freq maintains the provision of a counter based frequency,
+ for compatibility reasons.
+
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 04fc786dd2c0..70df2a24437b 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -747,9 +747,14 @@ show_one(cpuinfo_transition_latency, cpuinfo.transition_latency);
show_one(scaling_min_freq, min);
show_one(scaling_max_freq, max);
-__weak unsigned int arch_freq_get_on_cpu(int cpu)
+__weak int arch_freq_get_on_cpu(int cpu)
{
- return 0;
+ return -EOPNOTSUPP;
+}
+
+static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy *policy)
+{
+ return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP;
}
static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
@@ -757,8 +762,11 @@ static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
ssize_t ret;
unsigned int freq;
- freq = arch_freq_get_on_cpu(policy->cpu);
- if (freq)
+ freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)
+ ? arch_freq_get_on_cpu(policy->cpu)
+ : 0;
+
+ if (freq > 0)
ret = sysfs_emit(buf, "%u\n", freq);
else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
ret = sysfs_emit(buf, "%u\n", cpufreq_driver->get(policy->cpu));
@@ -802,6 +810,19 @@ static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
return sysfs_emit(buf, "<unknown>\n");
}
+/*
+ * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware
+ */
+static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
+ char *buf)
+{
+ int avg_freq = arch_freq_get_on_cpu(policy->cpu);
+
+ if (avg_freq > 0)
+ return sysfs_emit(buf, "%u\n", avg_freq);
+ return avg_freq != 0 ? avg_freq : -EINVAL;
+}
+
/*
* show_scaling_governor - show the current policy for the specified CPU
*/
@@ -964,6 +985,7 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
}
cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
+cpufreq_freq_attr_ro(cpuinfo_avg_freq);
cpufreq_freq_attr_ro(cpuinfo_min_freq);
cpufreq_freq_attr_ro(cpuinfo_max_freq);
cpufreq_freq_attr_ro(cpuinfo_transition_latency);
@@ -1091,6 +1113,12 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
return ret;
}
+ if (cpufreq_avg_freq_supported(policy)) {
+ ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr);
+ if (ret)
+ return ret;
+ }
+
ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
if (ret)
return ret;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index d4d2f4d1d7cb..a7b6c0ccf9bc 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -1194,7 +1194,7 @@ static inline int of_perf_domain_get_sharing_cpumask(int pcpu, const char *list_
}
#endif
-extern unsigned int arch_freq_get_on_cpu(int cpu);
+extern int arch_freq_get_on_cpu(int cpu);
#ifndef arch_set_freq_scale
static __always_inline
--
2.25.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-06 13:55 ` [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry Beata Michalska
@ 2024-12-12 6:51 ` Viresh Kumar
2024-12-16 22:15 ` Beata Michalska
2024-12-16 5:43 ` Kai-Heng Feng
1 sibling, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2024-12-12 6:51 UTC (permalink / raw)
To: Beata Michalska
Cc: linux-kernel, linux-arm-kernel, linux-pm, ionela.voinescu,
sudeep.holla, will, catalin.marinas, rafael, sumitg, yang,
vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc
On 06-12-24, 13:55, Beata Michalska wrote:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 04fc786dd2c0..70df2a24437b 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -747,9 +747,14 @@ show_one(cpuinfo_transition_latency, cpuinfo.transition_latency);
> show_one(scaling_min_freq, min);
> show_one(scaling_max_freq, max);
>
> -__weak unsigned int arch_freq_get_on_cpu(int cpu)
> +__weak int arch_freq_get_on_cpu(int cpu)
> {
> - return 0;
> + return -EOPNOTSUPP;
I did suggest not doing this as it may not be acceptable.
https://lore.kernel.org/all/CAKohpokFUpQyHYO017kOn-Jbt0CFZ1GuxoG3N-fenWJ_poW=4Q@mail.gmail.com/
--
viresh
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-06 13:55 ` [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry Beata Michalska
2024-12-12 6:51 ` Viresh Kumar
@ 2024-12-16 5:43 ` Kai-Heng Feng
2024-12-16 7:11 ` Sumit Gupta
2024-12-16 22:21 ` Beata Michalska
1 sibling, 2 replies; 13+ messages in thread
From: Kai-Heng Feng @ 2024-12-16 5:43 UTC (permalink / raw)
To: Beata Michalska, linux-kernel, linux-arm-kernel, linux-pm,
ionela.voinescu, sudeep.holla, will, catalin.marinas, rafael,
viresh.kumar
Cc: sumitg, yang, vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc, Carol Soto
Hi Beata,
On 2024/12/6 9:55 PM, Beata Michalska wrote:
> Currently the CPUFreq core exposes two sysfs attributes that can be used
> to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
> and scaling_cur_freq. Both provide slightly different view on the
> subject and they do come with their own drawbacks.
>
> cpuinfo_cur_freq provides higher precision though at a cost of being
> rather expensive. Moreover, the information retrieved via this attribute
> is somewhat short lived as frequency can change at any point of time
> making it difficult to reason from.
>
> scaling_cur_freq, on the other hand, tends to be less accurate but then
> the actual level of precision (and source of information) varies between
> architectures making it a bit ambiguous.
>
> The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
> distinct interface, exposing an average frequency of a given CPU(s), as
> reported by the hardware, over a time frame spanning no more than a few
> milliseconds. As it requires appropriate hardware support, this
> interface is optional.
>
> Note that under the hood, the new attribute relies on the information
> provided by arch_freq_get_on_cpu, which, up to this point, has been
> feeding data for scaling_cur_freq attribute, being the source of
> ambiguity when it comes to interpretation. This has been amended by
> restoring the intended behavior for scaling_cur_freq, with a new
> dedicated config option to maintain status quo for those, who may need
> it.
>
> CC: Jonathan Corbet <corbet@lwn.net>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Ingo Molnar <mingo@redhat.com>
> CC: Borislav Petkov <bp@alien8.de>
> CC: Dave Hansen <dave.hansen@linux.intel.com>
> CC: H. Peter Anvin <hpa@zytor.com>
> CC: Phil Auld <pauld@redhat.com>
> CC: x86@kernel.org
> CC: linux-doc@vger.kernel.org
> Signed-off-by: Beata Michalska <beata.michalska@arm.com>
> ---
> Documentation/admin-guide/pm/cpufreq.rst | 16 ++++++++++-
> arch/x86/kernel/cpu/aperfmperf.c | 2 +-
> arch/x86/kernel/cpu/proc.c | 7 +++--
> drivers/cpufreq/Kconfig.x86 | 12 ++++++++
> drivers/cpufreq/cpufreq.c | 36 +++++++++++++++++++++---
> include/linux/cpufreq.h | 2 +-
> 6 files changed, 66 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
> index fe1be4ad88cb..76f3835afe01 100644
> --- a/Documentation/admin-guide/pm/cpufreq.rst
> +++ b/Documentation/admin-guide/pm/cpufreq.rst
> @@ -248,6 +248,19 @@ are the following:
> If that frequency cannot be determined, this attribute should not
> be present.
>
> +``cpuinfo_avg_freq``
> + An average frequency (in KHz) of all CPUs belonging to a given policy,
> + derived from a hardware provided feedback and reported on a time frame
> + spanning at most few milliseconds.
> +
> + This is expected to be based on the frequency the hardware actually runs
> + at and, as such, might require specialised hardware support (such as AMU
> + extension on ARM). If one cannot be determined, this attribute should
> + not be present.
> +
> + Note, that failed attempt to retrieve current frequency for a given
> + CPU(s) will result in an appropriate error.
> +
> ``cpuinfo_max_freq``
> Maximum possible operating frequency the CPUs belonging to this policy
> can run at (in kHz).
> @@ -293,7 +306,8 @@ are the following:
> Some architectures (e.g. ``x86``) may attempt to provide information
> more precisely reflecting the current CPU frequency through this
> attribute, but that still may not be the exact current CPU frequency as
> - seen by the hardware at the moment.
> + seen by the hardware at the moment. This behavior though, is only
> + available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option.
>
> ``scaling_driver``
> The scaling driver currently in use.
> diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
> index 0b69bfbf345d..a00059139ca4 100644
> --- a/arch/x86/kernel/cpu/aperfmperf.c
> +++ b/arch/x86/kernel/cpu/aperfmperf.c
> @@ -413,7 +413,7 @@ void arch_scale_freq_tick(void)
> */
> #define MAX_SAMPLE_AGE ((unsigned long)HZ / 50)
>
> -unsigned int arch_freq_get_on_cpu(int cpu)
> +int arch_freq_get_on_cpu(int cpu)
> {
> struct aperfmperf *s = per_cpu_ptr(&cpu_samples, cpu);
> unsigned int seq, freq;
> diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
> index e65fae63660e..34d8fb93fb70 100644
> --- a/arch/x86/kernel/cpu/proc.c
> +++ b/arch/x86/kernel/cpu/proc.c
> @@ -86,9 +86,12 @@ static int show_cpuinfo(struct seq_file *m, void *v)
> seq_printf(m, "microcode\t: 0x%x\n", c->microcode);
>
> if (cpu_has(c, X86_FEATURE_TSC)) {
> - unsigned int freq = arch_freq_get_on_cpu(cpu);
> + int freq = arch_freq_get_on_cpu(cpu);
>
> - seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000));
> + if (freq <= 0)
> + seq_puts(m, "cpu MHz\t\t: Unknown\n");
> + else
> + seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000));
> }
>
> /* Cache size */
> diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
> index 97c2d4f15d76..212e1b9afe21 100644
> --- a/drivers/cpufreq/Kconfig.x86
> +++ b/drivers/cpufreq/Kconfig.x86
> @@ -340,3 +340,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK
> option lets the probing code bypass some of those checks if the
> parameter "relaxed_check=1" is passed to the module.
>
> +config CPUFREQ_ARCH_CUR_FREQ
> + default y
> + bool "Current frequency derived from HW provided feedback"
> + help
> + This determines whether the scaling_cur_freq sysfs attribute returns
> + the last requested frequency or a more precise value based on hardware
> + provided feedback (as architected counters).
> + Given that a more precise frequency can now be provided via the
> + cpuinfo_avg_cur_freq attribute, by enabling this option,
> + scaling_cur_freq maintains the provision of a counter based frequency,
> + for compatibility reasons.
> +
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 04fc786dd2c0..70df2a24437b 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -747,9 +747,14 @@ show_one(cpuinfo_transition_latency, cpuinfo.transition_latency);
> show_one(scaling_min_freq, min);
> show_one(scaling_max_freq, max);
>
> -__weak unsigned int arch_freq_get_on_cpu(int cpu)
> +__weak int arch_freq_get_on_cpu(int cpu)
> {
> - return 0;
> + return -EOPNOTSUPP;
> +}
> +
> +static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy *policy)
> +{
> + return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP;
> }
>
> static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
> @@ -757,8 +762,11 @@ static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
> ssize_t ret;
> unsigned int freq;
>
> - freq = arch_freq_get_on_cpu(policy->cpu);
> - if (freq)
> + freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)
> + ? arch_freq_get_on_cpu(policy->cpu)
> + : 0;
> +
> + if (freq > 0)
> ret = sysfs_emit(buf, "%u\n", freq);
> else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
> ret = sysfs_emit(buf, "%u\n", cpufreq_driver->get(policy->cpu));
> @@ -802,6 +810,19 @@ static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
> return sysfs_emit(buf, "<unknown>\n");
> }
>
> +/*
> + * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware
> + */
> +static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
> + char *buf)
> +{
> + int avg_freq = arch_freq_get_on_cpu(policy->cpu);
We are seeing issues when reading cpuinfo_avg_freq on an ARM64 system:
$ cat /sys/devices/system/cpu/cpufreq/policy1/cpuinfo_avg_freq
cat: /sys/devices/system/cpu/cpufreq/policy1/cpuinfo_avg_freq: Resource
temporarily unavailable
The CPU is in idle state, so arch_freq_get_on_cpu() can't find a good
alternative source for frequency info.
One way to resolve this is to have fallback methods in show_cpuinfo_avg_freq()
so it will look like this:
static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
char *buf)
{
int avg_freq = arch_freq_get_on_cpu(policy->cpu);
int ret;
if (avg_freq > 0)
ret = sysfs_emit(buf, "%u\n", avg_freq);
else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
ret = sysfs_emit(buf, "%u\n", cpufreq_driver->get(policy->cpu));
else
ret = sysfs_emit(buf, "%u\n", policy->cur);
return ret;
}
But that also makes show_cpuinfo_avg_freq() pretty much the same as
show_scaling_cur_freq().
So is it possible to consolidate show_cpuinfo_avg_freq() into
show_scaling_cur_freq(), by making CONFIG_CPUFREQ_ARCH_CUR_FREQ also available
to ARM64?
Kai-Heng
> +
> + if (avg_freq > 0)
> + return sysfs_emit(buf, "%u\n", avg_freq);
> + return avg_freq != 0 ? avg_freq : -EINVAL;
> +}
> +
> /*
> * show_scaling_governor - show the current policy for the specified CPU
> */
> @@ -964,6 +985,7 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
> }
>
> cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
> +cpufreq_freq_attr_ro(cpuinfo_avg_freq);
> cpufreq_freq_attr_ro(cpuinfo_min_freq);
> cpufreq_freq_attr_ro(cpuinfo_max_freq);
> cpufreq_freq_attr_ro(cpuinfo_transition_latency);
> @@ -1091,6 +1113,12 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
> return ret;
> }
>
> + if (cpufreq_avg_freq_supported(policy)) {
> + ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr);
> + if (ret)
> + return ret;
> + }
> +
> ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
> if (ret)
> return ret;
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index d4d2f4d1d7cb..a7b6c0ccf9bc 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -1194,7 +1194,7 @@ static inline int of_perf_domain_get_sharing_cpumask(int pcpu, const char *list_
> }
> #endif
>
> -extern unsigned int arch_freq_get_on_cpu(int cpu);
> +extern int arch_freq_get_on_cpu(int cpu);
>
> #ifndef arch_set_freq_scale
> static __always_inline
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-16 5:43 ` Kai-Heng Feng
@ 2024-12-16 7:11 ` Sumit Gupta
2024-12-16 8:33 ` Kai-Heng Feng
2024-12-16 22:21 ` Beata Michalska
1 sibling, 1 reply; 13+ messages in thread
From: Sumit Gupta @ 2024-12-16 7:11 UTC (permalink / raw)
To: Kai-Heng Feng, Beata Michalska, linux-kernel, linux-arm-kernel,
linux-pm, ionela.voinescu, sudeep.holla, will, catalin.marinas,
rafael, viresh.kumar
Cc: yang, vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc, Carol Soto,
linux-tegra
On 16/12/24 11:13, Kai-Heng Feng wrote:
> Hi Beata,
>
> On 2024/12/6 9:55 PM, Beata Michalska wrote:
>> Currently the CPUFreq core exposes two sysfs attributes that can be used
>> to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
>> and scaling_cur_freq. Both provide slightly different view on the
>> subject and they do come with their own drawbacks.
>>
>> cpuinfo_cur_freq provides higher precision though at a cost of being
>> rather expensive. Moreover, the information retrieved via this attribute
>> is somewhat short lived as frequency can change at any point of time
>> making it difficult to reason from.
>>
>> scaling_cur_freq, on the other hand, tends to be less accurate but then
>> the actual level of precision (and source of information) varies between
>> architectures making it a bit ambiguous.
>>
>> The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
>> distinct interface, exposing an average frequency of a given CPU(s), as
>> reported by the hardware, over a time frame spanning no more than a few
>> milliseconds. As it requires appropriate hardware support, this
>> interface is optional.
>>
>> Note that under the hood, the new attribute relies on the information
>> provided by arch_freq_get_on_cpu, which, up to this point, has been
>> feeding data for scaling_cur_freq attribute, being the source of
>> ambiguity when it comes to interpretation. This has been amended by
>> restoring the intended behavior for scaling_cur_freq, with a new
>> dedicated config option to maintain status quo for those, who may need
>> it.
>>
>> CC: Jonathan Corbet <corbet@lwn.net>
>> CC: Thomas Gleixner <tglx@linutronix.de>
>> CC: Ingo Molnar <mingo@redhat.com>
>> CC: Borislav Petkov <bp@alien8.de>
>> CC: Dave Hansen <dave.hansen@linux.intel.com>
>> CC: H. Peter Anvin <hpa@zytor.com>
>> CC: Phil Auld <pauld@redhat.com>
>> CC: x86@kernel.org
>> CC: linux-doc@vger.kernel.org
>> Signed-off-by: Beata Michalska <beata.michalska@arm.com>
>> ---
>> Documentation/admin-guide/pm/cpufreq.rst | 16 ++++++++++-
>> arch/x86/kernel/cpu/aperfmperf.c | 2 +-
>> arch/x86/kernel/cpu/proc.c | 7 +++--
>> drivers/cpufreq/Kconfig.x86 | 12 ++++++++
>> drivers/cpufreq/cpufreq.c | 36 +++++++++++++++++++++---
>> include/linux/cpufreq.h | 2 +-
>> 6 files changed, 66 insertions(+), 9 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/pm/cpufreq.rst
>> b/Documentation/admin-guide/pm/cpufreq.rst
>> index fe1be4ad88cb..76f3835afe01 100644
>> --- a/Documentation/admin-guide/pm/cpufreq.rst
>> +++ b/Documentation/admin-guide/pm/cpufreq.rst
>> @@ -248,6 +248,19 @@ are the following:
>> If that frequency cannot be determined, this attribute should not
>> be present.
>> +``cpuinfo_avg_freq``
>> + An average frequency (in KHz) of all CPUs belonging to a
>> given policy,
>> + derived from a hardware provided feedback and reported on a
>> time frame
>> + spanning at most few milliseconds.
>> +
>> + This is expected to be based on the frequency the hardware
>> actually runs
>> + at and, as such, might require specialised hardware support
>> (such as AMU
>> + extension on ARM). If one cannot be determined, this
>> attribute should
>> + not be present.
>> +
>> + Note, that failed attempt to retrieve current frequency for a
>> given
>> + CPU(s) will result in an appropriate error.
>> +
>> ``cpuinfo_max_freq``
>> Maximum possible operating frequency the CPUs belonging to this
>> policy
>> can run at (in kHz).
>> @@ -293,7 +306,8 @@ are the following:
>> Some architectures (e.g. ``x86``) may attempt to provide
>> information
>> more precisely reflecting the current CPU frequency through this
>> attribute, but that still may not be the exact current CPU
>> frequency as
>> - seen by the hardware at the moment.
>> + seen by the hardware at the moment. This behavior though, is only
>> + available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option.
>> ``scaling_driver``
>> The scaling driver currently in use.
>> diff --git a/arch/x86/kernel/cpu/aperfmperf.c
>> b/arch/x86/kernel/cpu/aperfmperf.c
>> index 0b69bfbf345d..a00059139ca4 100644
>> --- a/arch/x86/kernel/cpu/aperfmperf.c
>> +++ b/arch/x86/kernel/cpu/aperfmperf.c
>> @@ -413,7 +413,7 @@ void arch_scale_freq_tick(void)
>> */
>> #define MAX_SAMPLE_AGE ((unsigned long)HZ / 50)
>> -unsigned int arch_freq_get_on_cpu(int cpu)
>> +int arch_freq_get_on_cpu(int cpu)
>> {
>> struct aperfmperf *s = per_cpu_ptr(&cpu_samples, cpu);
>> unsigned int seq, freq;
>> diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
>> index e65fae63660e..34d8fb93fb70 100644
>> --- a/arch/x86/kernel/cpu/proc.c
>> +++ b/arch/x86/kernel/cpu/proc.c
>> @@ -86,9 +86,12 @@ static int show_cpuinfo(struct seq_file *m, void *v)
>> seq_printf(m, "microcode\t: 0x%x\n", c->microcode);
>> if (cpu_has(c, X86_FEATURE_TSC)) {
>> - unsigned int freq = arch_freq_get_on_cpu(cpu);
>> + int freq = arch_freq_get_on_cpu(cpu);
>> - seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq %
>> 1000));
>> + if (freq <= 0)
>> + seq_puts(m, "cpu MHz\t\t: Unknown\n");
>> + else
>> + seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000,
>> (freq % 1000));
>> }
>> /* Cache size */
>> diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
>> index 97c2d4f15d76..212e1b9afe21 100644
>> --- a/drivers/cpufreq/Kconfig.x86
>> +++ b/drivers/cpufreq/Kconfig.x86
>> @@ -340,3 +340,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK
>> option lets the probing code bypass some of those checks if the
>> parameter "relaxed_check=1" is passed to the module.
>> +config CPUFREQ_ARCH_CUR_FREQ
>> + default y
>> + bool "Current frequency derived from HW provided feedback"
>> + help
>> + This determines whether the scaling_cur_freq sysfs attribute
>> returns
>> + the last requested frequency or a more precise value based on
>> hardware
>> + provided feedback (as architected counters).
>> + Given that a more precise frequency can now be provided via the
>> + cpuinfo_avg_cur_freq attribute, by enabling this option,
>> + scaling_cur_freq maintains the provision of a counter based
>> frequency,
>> + for compatibility reasons.
>> +
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 04fc786dd2c0..70df2a24437b 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -747,9 +747,14 @@ show_one(cpuinfo_transition_latency,
>> cpuinfo.transition_latency);
>> show_one(scaling_min_freq, min);
>> show_one(scaling_max_freq, max);
>> -__weak unsigned int arch_freq_get_on_cpu(int cpu)
>> +__weak int arch_freq_get_on_cpu(int cpu)
>> {
>> - return 0;
>> + return -EOPNOTSUPP;
>> +}
>> +
>> +static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy
>> *policy)
>> +{
>> + return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP;
>> }
>> static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy,
>> char *buf)
>> @@ -757,8 +762,11 @@ static ssize_t show_scaling_cur_freq(struct
>> cpufreq_policy *policy, char *buf)
>> ssize_t ret;
>> unsigned int freq;
>> - freq = arch_freq_get_on_cpu(policy->cpu);
>> - if (freq)
>> + freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)
>> + ? arch_freq_get_on_cpu(policy->cpu)
>> + : 0;
>> +
>> + if (freq > 0)
>> ret = sysfs_emit(buf, "%u\n", freq);
>> else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
>> ret = sysfs_emit(buf, "%u\n",
>> cpufreq_driver->get(policy->cpu));
>> @@ -802,6 +810,19 @@ static ssize_t show_cpuinfo_cur_freq(struct
>> cpufreq_policy *policy,
>> return sysfs_emit(buf, "<unknown>\n");
>> }
>> +/*
>> + * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware
>> + */
>> +static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
>> + char *buf)
>> +{
>> + int avg_freq = arch_freq_get_on_cpu(policy->cpu);
>
> We are seeing issues when reading cpuinfo_avg_freq on an ARM64 system:
>
> $ cat /sys/devices/system/cpu/cpufreq/policy1/cpuinfo_avg_freq
> cat: /sys/devices/system/cpu/cpufreq/policy1/cpuinfo_avg_freq: Resource
> temporarily unavailable
>
> The CPU is in idle state, so arch_freq_get_on_cpu() can't find a good
> alternative source for frequency info.
>
Hi Kai Heng,
This has already been discussed during v7 in [1] & [2].
In v7, we were returning zero which printed 'unknown'.
The discussion was about printing in more descriptive way or with an
appropriate error code. In v8 we are returning 'EAGAIN' instead of zero.
The final decision was of Maintainers.
Viresh,
You have any preference on this?
[1]
https://lore.kernel.org/lkml/aa254516-968e-4665-bb5b-981c296ffc35@nvidia.com/#t
[2] https://lore.kernel.org/lkml/Zyh-uVSW-0d0r8oB@arm.com/
Thank you,
Sumit Gupta
> One way to resolve this is to have fallback methods in
> show_cpuinfo_avg_freq() so it will look like this:
>
> static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
> char *buf)
> {
> int avg_freq = arch_freq_get_on_cpu(policy->cpu);
> int ret;
>
> if (avg_freq > 0)
> ret = sysfs_emit(buf, "%u\n", avg_freq);
> else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
> ret = sysfs_emit(buf, "%u\n",
> cpufreq_driver->get(policy->cpu));
> else
> ret = sysfs_emit(buf, "%u\n", policy->cur);
> return ret;
> }
>
> But that also makes show_cpuinfo_avg_freq() pretty much the same as
> show_scaling_cur_freq().
>
> So is it possible to consolidate show_cpuinfo_avg_freq() into
> show_scaling_cur_freq(), by making CONFIG_CPUFREQ_ARCH_CUR_FREQ also
> available to ARM64?
>
> Kai-Heng
>
>> +
>> + if (avg_freq > 0)
>> + return sysfs_emit(buf, "%u\n", avg_freq);
>> + return avg_freq != 0 ? avg_freq : -EINVAL;
>> +}
>> +
>> /*
>> * show_scaling_governor - show the current policy for the specified
>> CPU
>> */
>> @@ -964,6 +985,7 @@ static ssize_t show_bios_limit(struct
>> cpufreq_policy *policy, char *buf)
>> }
>> cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
>> +cpufreq_freq_attr_ro(cpuinfo_avg_freq);
>> cpufreq_freq_attr_ro(cpuinfo_min_freq);
>> cpufreq_freq_attr_ro(cpuinfo_max_freq);
>> cpufreq_freq_attr_ro(cpuinfo_transition_latency);
>> @@ -1091,6 +1113,12 @@ static int cpufreq_add_dev_interface(struct
>> cpufreq_policy *policy)
>> return ret;
>> }
>> + if (cpufreq_avg_freq_supported(policy)) {
>> + ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr);
>> + if (ret)
>> + return ret;
>> + }
>> +
>> ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
>> if (ret)
>> return ret;
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index d4d2f4d1d7cb..a7b6c0ccf9bc 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -1194,7 +1194,7 @@ static inline int
>> of_perf_domain_get_sharing_cpumask(int pcpu, const char *list_
>> }
>> #endif
>> -extern unsigned int arch_freq_get_on_cpu(int cpu);
>> +extern int arch_freq_get_on_cpu(int cpu);
>> #ifndef arch_set_freq_scale
>> static __always_inline
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-16 7:11 ` Sumit Gupta
@ 2024-12-16 8:33 ` Kai-Heng Feng
2024-12-16 22:32 ` Beata Michalska
0 siblings, 1 reply; 13+ messages in thread
From: Kai-Heng Feng @ 2024-12-16 8:33 UTC (permalink / raw)
To: Sumit Gupta, Beata Michalska, linux-kernel, linux-arm-kernel,
linux-pm, ionela.voinescu, sudeep.holla, will, catalin.marinas,
rafael, viresh.kumar
Cc: yang, vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc, Carol Soto,
linux-tegra
Hi Sumit,
On 2024/12/16 3:11 PM, Sumit Gupta wrote:
>
>
> On 16/12/24 11:13, Kai-Heng Feng wrote:
>> Hi Beata,
>>
>> On 2024/12/6 9:55 PM, Beata Michalska wrote:
>>> Currently the CPUFreq core exposes two sysfs attributes that can be used
>>> to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
>>> and scaling_cur_freq. Both provide slightly different view on the
>>> subject and they do come with their own drawbacks.
>>>
>>> cpuinfo_cur_freq provides higher precision though at a cost of being
>>> rather expensive. Moreover, the information retrieved via this attribute
>>> is somewhat short lived as frequency can change at any point of time
>>> making it difficult to reason from.
>>>
>>> scaling_cur_freq, on the other hand, tends to be less accurate but then
>>> the actual level of precision (and source of information) varies between
>>> architectures making it a bit ambiguous.
>>>
>>> The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
>>> distinct interface, exposing an average frequency of a given CPU(s), as
>>> reported by the hardware, over a time frame spanning no more than a few
>>> milliseconds. As it requires appropriate hardware support, this
>>> interface is optional.
>>>
>>> Note that under the hood, the new attribute relies on the information
>>> provided by arch_freq_get_on_cpu, which, up to this point, has been
>>> feeding data for scaling_cur_freq attribute, being the source of
>>> ambiguity when it comes to interpretation. This has been amended by
>>> restoring the intended behavior for scaling_cur_freq, with a new
>>> dedicated config option to maintain status quo for those, who may need
>>> it.
>>>
>>> CC: Jonathan Corbet <corbet@lwn.net>
>>> CC: Thomas Gleixner <tglx@linutronix.de>
>>> CC: Ingo Molnar <mingo@redhat.com>
>>> CC: Borislav Petkov <bp@alien8.de>
>>> CC: Dave Hansen <dave.hansen@linux.intel.com>
>>> CC: H. Peter Anvin <hpa@zytor.com>
>>> CC: Phil Auld <pauld@redhat.com>
>>> CC: x86@kernel.org
>>> CC: linux-doc@vger.kernel.org
>>> Signed-off-by: Beata Michalska <beata.michalska@arm.com>
>>> ---
>>> Documentation/admin-guide/pm/cpufreq.rst | 16 ++++++++++-
>>> arch/x86/kernel/cpu/aperfmperf.c | 2 +-
>>> arch/x86/kernel/cpu/proc.c | 7 +++--
>>> drivers/cpufreq/Kconfig.x86 | 12 ++++++++
>>> drivers/cpufreq/cpufreq.c | 36 +++++++++++++++++++++---
>>> include/linux/cpufreq.h | 2 +-
>>> 6 files changed, 66 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-
>>> guide/pm/cpufreq.rst
>>> index fe1be4ad88cb..76f3835afe01 100644
>>> --- a/Documentation/admin-guide/pm/cpufreq.rst
>>> +++ b/Documentation/admin-guide/pm/cpufreq.rst
>>> @@ -248,6 +248,19 @@ are the following:
>>> If that frequency cannot be determined, this attribute should not
>>> be present.
>>> +``cpuinfo_avg_freq``
>>> + An average frequency (in KHz) of all CPUs belonging to a given policy,
>>> + derived from a hardware provided feedback and reported on a time frame
>>> + spanning at most few milliseconds.
>>> +
>>> + This is expected to be based on the frequency the hardware actually
>>> runs
>>> + at and, as such, might require specialised hardware support (such as
>>> AMU
>>> + extension on ARM). If one cannot be determined, this attribute should
>>> + not be present.
>>> +
>>> + Note, that failed attempt to retrieve current frequency for a given
>>> + CPU(s) will result in an appropriate error.
>>> +
>>> ``cpuinfo_max_freq``
>>> Maximum possible operating frequency the CPUs belonging to this policy
>>> can run at (in kHz).
>>> @@ -293,7 +306,8 @@ are the following:
>>> Some architectures (e.g. ``x86``) may attempt to provide information
>>> more precisely reflecting the current CPU frequency through this
>>> attribute, but that still may not be the exact current CPU frequency as
>>> - seen by the hardware at the moment.
>>> + seen by the hardware at the moment. This behavior though, is only
>>> + available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option.
>>> ``scaling_driver``
>>> The scaling driver currently in use.
>>> diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
>>> index 0b69bfbf345d..a00059139ca4 100644
>>> --- a/arch/x86/kernel/cpu/aperfmperf.c
>>> +++ b/arch/x86/kernel/cpu/aperfmperf.c
>>> @@ -413,7 +413,7 @@ void arch_scale_freq_tick(void)
>>> */
>>> #define MAX_SAMPLE_AGE ((unsigned long)HZ / 50)
>>> -unsigned int arch_freq_get_on_cpu(int cpu)
>>> +int arch_freq_get_on_cpu(int cpu)
>>> {
>>> struct aperfmperf *s = per_cpu_ptr(&cpu_samples, cpu);
>>> unsigned int seq, freq;
>>> diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
>>> index e65fae63660e..34d8fb93fb70 100644
>>> --- a/arch/x86/kernel/cpu/proc.c
>>> +++ b/arch/x86/kernel/cpu/proc.c
>>> @@ -86,9 +86,12 @@ static int show_cpuinfo(struct seq_file *m, void *v)
>>> seq_printf(m, "microcode\t: 0x%x\n", c->microcode);
>>> if (cpu_has(c, X86_FEATURE_TSC)) {
>>> - unsigned int freq = arch_freq_get_on_cpu(cpu);
>>> + int freq = arch_freq_get_on_cpu(cpu);
>>> - seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000));
>>> + if (freq <= 0)
>>> + seq_puts(m, "cpu MHz\t\t: Unknown\n");
>>> + else
>>> + seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq %
>>> 1000));
>>> }
>>> /* Cache size */
>>> diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
>>> index 97c2d4f15d76..212e1b9afe21 100644
>>> --- a/drivers/cpufreq/Kconfig.x86
>>> +++ b/drivers/cpufreq/Kconfig.x86
>>> @@ -340,3 +340,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK
>>> option lets the probing code bypass some of those checks if the
>>> parameter "relaxed_check=1" is passed to the module.
>>> +config CPUFREQ_ARCH_CUR_FREQ
>>> + default y
>>> + bool "Current frequency derived from HW provided feedback"
>>> + help
>>> + This determines whether the scaling_cur_freq sysfs attribute returns
>>> + the last requested frequency or a more precise value based on hardware
>>> + provided feedback (as architected counters).
>>> + Given that a more precise frequency can now be provided via the
>>> + cpuinfo_avg_cur_freq attribute, by enabling this option,
>>> + scaling_cur_freq maintains the provision of a counter based frequency,
>>> + for compatibility reasons.
>>> +
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index 04fc786dd2c0..70df2a24437b 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -747,9 +747,14 @@ show_one(cpuinfo_transition_latency,
>>> cpuinfo.transition_latency);
>>> show_one(scaling_min_freq, min);
>>> show_one(scaling_max_freq, max);
>>> -__weak unsigned int arch_freq_get_on_cpu(int cpu)
>>> +__weak int arch_freq_get_on_cpu(int cpu)
>>> {
>>> - return 0;
>>> + return -EOPNOTSUPP;
>>> +}
>>> +
>>> +static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy *policy)
>>> +{
>>> + return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP;
>>> }
>>> static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
>>> @@ -757,8 +762,11 @@ static ssize_t show_scaling_cur_freq(struct
>>> cpufreq_policy *policy, char *buf)
>>> ssize_t ret;
>>> unsigned int freq;
>>> - freq = arch_freq_get_on_cpu(policy->cpu);
>>> - if (freq)
>>> + freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)
>>> + ? arch_freq_get_on_cpu(policy->cpu)
>>> + : 0;
>>> +
>>> + if (freq > 0)
>>> ret = sysfs_emit(buf, "%u\n", freq);
>>> else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
>>> ret = sysfs_emit(buf, "%u\n", cpufreq_driver->get(policy->cpu));
>>> @@ -802,6 +810,19 @@ static ssize_t show_cpuinfo_cur_freq(struct
>>> cpufreq_policy *policy,
>>> return sysfs_emit(buf, "<unknown>\n");
>>> }
>>> +/*
>>> + * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware
>>> + */
>>> +static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
>>> + char *buf)
>>> +{
>>> + int avg_freq = arch_freq_get_on_cpu(policy->cpu);
>>
>> We are seeing issues when reading cpuinfo_avg_freq on an ARM64 system:
>>
>> $ cat /sys/devices/system/cpu/cpufreq/policy1/cpuinfo_avg_freq
>> cat: /sys/devices/system/cpu/cpufreq/policy1/cpuinfo_avg_freq: Resource
>> temporarily unavailable
>>
>> The CPU is in idle state, so arch_freq_get_on_cpu() can't find a good
>> alternative source for frequency info.
>>
>
> Hi Kai Heng,
> This has already been discussed during v7 in [1] & [2].
Thanks for the info!
> In v7, we were returning zero which printed 'unknown'.
> The discussion was about printing in more descriptive way or with an appropriate
> error code. In v8 we are returning 'EAGAIN' instead of zero. The final decision
> was of Maintainers.
Is there any cpufreq driver that prints "unknown" or error when CPU is in idle?
I think it's more unsurprising to print the lowest CPU frequency when CPU is in
idle state, instead of any other error code.
Kai-Heng
>
> Viresh,
> You have any preference on this?
>
> [1] https://lore.kernel.org/lkml/aa254516-968e-4665-bb5b-981c296ffc35@nvidia.com/#t
> [2] https://lore.kernel.org/lkml/Zyh-uVSW-0d0r8oB@arm.com/
>
> Thank you,
> Sumit Gupta
>
>> One way to resolve this is to have fallback methods in show_cpuinfo_avg_freq()
>> so it will look like this:
>>
>> static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
>> char *buf)
>> {
>> int avg_freq = arch_freq_get_on_cpu(policy->cpu);
>> int ret;
>>
>> if (avg_freq > 0)
>> ret = sysfs_emit(buf, "%u\n", avg_freq);
>> else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
>> ret = sysfs_emit(buf, "%u\n", cpufreq_driver->get(policy->cpu));
>> else
>> ret = sysfs_emit(buf, "%u\n", policy->cur);
>> return ret;
>> }
>>
>> But that also makes show_cpuinfo_avg_freq() pretty much the same as
>> show_scaling_cur_freq().
>>
>> So is it possible to consolidate show_cpuinfo_avg_freq() into
>> show_scaling_cur_freq(), by making CONFIG_CPUFREQ_ARCH_CUR_FREQ also available
>> to ARM64?
>>
>> Kai-Heng
>>
>>> +
>>> + if (avg_freq > 0)
>>> + return sysfs_emit(buf, "%u\n", avg_freq);
>>> + return avg_freq != 0 ? avg_freq : -EINVAL;
>>> +}
>>> +
>>> /*
>>> * show_scaling_governor - show the current policy for the specified CPU
>>> */
>>> @@ -964,6 +985,7 @@ static ssize_t show_bios_limit(struct cpufreq_policy
>>> *policy, char *buf)
>>> }
>>> cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
>>> +cpufreq_freq_attr_ro(cpuinfo_avg_freq);
>>> cpufreq_freq_attr_ro(cpuinfo_min_freq);
>>> cpufreq_freq_attr_ro(cpuinfo_max_freq);
>>> cpufreq_freq_attr_ro(cpuinfo_transition_latency);
>>> @@ -1091,6 +1113,12 @@ static int cpufreq_add_dev_interface(struct
>>> cpufreq_policy *policy)
>>> return ret;
>>> }
>>> + if (cpufreq_avg_freq_supported(policy)) {
>>> + ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr);
>>> + if (ret)
>>> + return ret;
>>> + }
>>> +
>>> ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
>>> if (ret)
>>> return ret;
>>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>>> index d4d2f4d1d7cb..a7b6c0ccf9bc 100644
>>> --- a/include/linux/cpufreq.h
>>> +++ b/include/linux/cpufreq.h
>>> @@ -1194,7 +1194,7 @@ static inline int
>>> of_perf_domain_get_sharing_cpumask(int pcpu, const char *list_
>>> }
>>> #endif
>>> -extern unsigned int arch_freq_get_on_cpu(int cpu);
>>> +extern int arch_freq_get_on_cpu(int cpu);
>>> #ifndef arch_set_freq_scale
>>> static __always_inline
>>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-12 6:51 ` Viresh Kumar
@ 2024-12-16 22:15 ` Beata Michalska
2024-12-17 4:27 ` Viresh Kumar
0 siblings, 1 reply; 13+ messages in thread
From: Beata Michalska @ 2024-12-16 22:15 UTC (permalink / raw)
To: Viresh Kumar
Cc: linux-kernel, linux-arm-kernel, linux-pm, ionela.voinescu,
sudeep.holla, will, catalin.marinas, rafael, sumitg, yang,
vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc
On Thu, Dec 12, 2024 at 12:21:00PM +0530, Viresh Kumar wrote:
> On 06-12-24, 13:55, Beata Michalska wrote:
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 04fc786dd2c0..70df2a24437b 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -747,9 +747,14 @@ show_one(cpuinfo_transition_latency, cpuinfo.transition_latency);
> > show_one(scaling_min_freq, min);
> > show_one(scaling_max_freq, max);
> >
> > -__weak unsigned int arch_freq_get_on_cpu(int cpu)
> > +__weak int arch_freq_get_on_cpu(int cpu)
> > {
> > - return 0;
> > + return -EOPNOTSUPP;
>
> I did suggest not doing this as it may not be acceptable.
>
> https://lore.kernel.org/all/CAKohpokFUpQyHYO017kOn-Jbt0CFZ1GuxoG3N-fenWJ_poW=4Q@mail.gmail.com/
>
My bad as I must have misinterpreted that message. Although I am not entirely
sure why this might be unacceptable as it is not such uncommon approach to use
signed int space to cover both: expected positive value as well as potential
error code case failure.
Enabling the new attribute for all is an option, tough not entirely compelling
one as exposing a feature that is known not to be supported seems bit
counterintuitive. On the other hand using cpufreq driver flags won't help much
as the support for the new attrib is platform-specific, not driver-specific.
---
BR
Beata
> --
> viresh
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-16 5:43 ` Kai-Heng Feng
2024-12-16 7:11 ` Sumit Gupta
@ 2024-12-16 22:21 ` Beata Michalska
1 sibling, 0 replies; 13+ messages in thread
From: Beata Michalska @ 2024-12-16 22:21 UTC (permalink / raw)
To: Kai-Heng Feng
Cc: linux-kernel, linux-arm-kernel, linux-pm, ionela.voinescu,
sudeep.holla, will, catalin.marinas, rafael, viresh.kumar, sumitg,
yang, vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc, Carol Soto
On Mon, Dec 16, 2024 at 01:43:02PM +0800, Kai-Heng Feng wrote:
> Hi Beata,
>
> On 2024/12/6 9:55 PM, Beata Michalska wrote:
> > Currently the CPUFreq core exposes two sysfs attributes that can be used
> > to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
> > and scaling_cur_freq. Both provide slightly different view on the
> > subject and they do come with their own drawbacks.
> >
> > cpuinfo_cur_freq provides higher precision though at a cost of being
> > rather expensive. Moreover, the information retrieved via this attribute
> > is somewhat short lived as frequency can change at any point of time
> > making it difficult to reason from.
> >
> > scaling_cur_freq, on the other hand, tends to be less accurate but then
> > the actual level of precision (and source of information) varies between
> > architectures making it a bit ambiguous.
> >
> > The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
> > distinct interface, exposing an average frequency of a given CPU(s), as
> > reported by the hardware, over a time frame spanning no more than a few
> > milliseconds. As it requires appropriate hardware support, this
> > interface is optional.
> >
> > Note that under the hood, the new attribute relies on the information
> > provided by arch_freq_get_on_cpu, which, up to this point, has been
> > feeding data for scaling_cur_freq attribute, being the source of
> > ambiguity when it comes to interpretation. This has been amended by
> > restoring the intended behavior for scaling_cur_freq, with a new
> > dedicated config option to maintain status quo for those, who may need
> > it.
> >
> > CC: Jonathan Corbet <corbet@lwn.net>
> > CC: Thomas Gleixner <tglx@linutronix.de>
> > CC: Ingo Molnar <mingo@redhat.com>
> > CC: Borislav Petkov <bp@alien8.de>
> > CC: Dave Hansen <dave.hansen@linux.intel.com>
> > CC: H. Peter Anvin <hpa@zytor.com>
> > CC: Phil Auld <pauld@redhat.com>
> > CC: x86@kernel.org
> > CC: linux-doc@vger.kernel.org
> > Signed-off-by: Beata Michalska <beata.michalska@arm.com>
> > ---
> > Documentation/admin-guide/pm/cpufreq.rst | 16 ++++++++++-
> > arch/x86/kernel/cpu/aperfmperf.c | 2 +-
> > arch/x86/kernel/cpu/proc.c | 7 +++--
> > drivers/cpufreq/Kconfig.x86 | 12 ++++++++
> > drivers/cpufreq/cpufreq.c | 36 +++++++++++++++++++++---
> > include/linux/cpufreq.h | 2 +-
> > 6 files changed, 66 insertions(+), 9 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
> > index fe1be4ad88cb..76f3835afe01 100644
> > --- a/Documentation/admin-guide/pm/cpufreq.rst
> > +++ b/Documentation/admin-guide/pm/cpufreq.rst
> > @@ -248,6 +248,19 @@ are the following:
> > If that frequency cannot be determined, this attribute should not
> > be present.
> > +``cpuinfo_avg_freq``
> > + An average frequency (in KHz) of all CPUs belonging to a given policy,
> > + derived from a hardware provided feedback and reported on a time frame
> > + spanning at most few milliseconds.
> > +
> > + This is expected to be based on the frequency the hardware actually runs
> > + at and, as such, might require specialised hardware support (such as AMU
> > + extension on ARM). If one cannot be determined, this attribute should
> > + not be present.
> > +
> > + Note, that failed attempt to retrieve current frequency for a given
> > + CPU(s) will result in an appropriate error.
> > +
> > ``cpuinfo_max_freq``
> > Maximum possible operating frequency the CPUs belonging to this policy
> > can run at (in kHz).
> > @@ -293,7 +306,8 @@ are the following:
> > Some architectures (e.g. ``x86``) may attempt to provide information
> > more precisely reflecting the current CPU frequency through this
> > attribute, but that still may not be the exact current CPU frequency as
> > - seen by the hardware at the moment.
> > + seen by the hardware at the moment. This behavior though, is only
> > + available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option.
> > ``scaling_driver``
> > The scaling driver currently in use.
> > diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
> > index 0b69bfbf345d..a00059139ca4 100644
> > --- a/arch/x86/kernel/cpu/aperfmperf.c
> > +++ b/arch/x86/kernel/cpu/aperfmperf.c
> > @@ -413,7 +413,7 @@ void arch_scale_freq_tick(void)
> > */
> > #define MAX_SAMPLE_AGE ((unsigned long)HZ / 50)
> > -unsigned int arch_freq_get_on_cpu(int cpu)
> > +int arch_freq_get_on_cpu(int cpu)
> > {
> > struct aperfmperf *s = per_cpu_ptr(&cpu_samples, cpu);
> > unsigned int seq, freq;
> > diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
> > index e65fae63660e..34d8fb93fb70 100644
> > --- a/arch/x86/kernel/cpu/proc.c
> > +++ b/arch/x86/kernel/cpu/proc.c
> > @@ -86,9 +86,12 @@ static int show_cpuinfo(struct seq_file *m, void *v)
> > seq_printf(m, "microcode\t: 0x%x\n", c->microcode);
> > if (cpu_has(c, X86_FEATURE_TSC)) {
> > - unsigned int freq = arch_freq_get_on_cpu(cpu);
> > + int freq = arch_freq_get_on_cpu(cpu);
> > - seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000));
> > + if (freq <= 0)
> > + seq_puts(m, "cpu MHz\t\t: Unknown\n");
> > + else
> > + seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000));
> > }
> > /* Cache size */
> > diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
> > index 97c2d4f15d76..212e1b9afe21 100644
> > --- a/drivers/cpufreq/Kconfig.x86
> > +++ b/drivers/cpufreq/Kconfig.x86
> > @@ -340,3 +340,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK
> > option lets the probing code bypass some of those checks if the
> > parameter "relaxed_check=1" is passed to the module.
> > +config CPUFREQ_ARCH_CUR_FREQ
> > + default y
> > + bool "Current frequency derived from HW provided feedback"
> > + help
> > + This determines whether the scaling_cur_freq sysfs attribute returns
> > + the last requested frequency or a more precise value based on hardware
> > + provided feedback (as architected counters).
> > + Given that a more precise frequency can now be provided via the
> > + cpuinfo_avg_cur_freq attribute, by enabling this option,
> > + scaling_cur_freq maintains the provision of a counter based frequency,
> > + for compatibility reasons.
> > +
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 04fc786dd2c0..70df2a24437b 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -747,9 +747,14 @@ show_one(cpuinfo_transition_latency, cpuinfo.transition_latency);
> > show_one(scaling_min_freq, min);
> > show_one(scaling_max_freq, max);
> > -__weak unsigned int arch_freq_get_on_cpu(int cpu)
> > +__weak int arch_freq_get_on_cpu(int cpu)
> > {
> > - return 0;
> > + return -EOPNOTSUPP;
> > +}
> > +
> > +static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy *policy)
> > +{
> > + return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP;
> > }
> > static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
> > @@ -757,8 +762,11 @@ static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
> > ssize_t ret;
> > unsigned int freq;
> > - freq = arch_freq_get_on_cpu(policy->cpu);
> > - if (freq)
> > + freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)
> > + ? arch_freq_get_on_cpu(policy->cpu)
> > + : 0;
> > +
> > + if (freq > 0)
> > ret = sysfs_emit(buf, "%u\n", freq);
> > else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
> > ret = sysfs_emit(buf, "%u\n", cpufreq_driver->get(policy->cpu));
> > @@ -802,6 +810,19 @@ static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
> > return sysfs_emit(buf, "<unknown>\n");
> > }
> > +/*
> > + * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware
> > + */
> > +static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
> > + char *buf)
> > +{
> > + int avg_freq = arch_freq_get_on_cpu(policy->cpu);
>
> We are seeing issues when reading cpuinfo_avg_freq on an ARM64 system:
>
> $ cat /sys/devices/system/cpu/cpufreq/policy1/cpuinfo_avg_freq
> cat: /sys/devices/system/cpu/cpufreq/policy1/cpuinfo_avg_freq: Resource
> temporarily unavailable
>
> The CPU is in idle state, so arch_freq_get_on_cpu() can't find a good
> alternative source for frequency info.
>
> One way to resolve this is to have fallback methods in
> show_cpuinfo_avg_freq() so it will look like this:
>
> static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
> char *buf)
> {
> int avg_freq = arch_freq_get_on_cpu(policy->cpu);
> int ret;
>
> if (avg_freq > 0)
> ret = sysfs_emit(buf, "%u\n", avg_freq);
> else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
> ret = sysfs_emit(buf, "%u\n", cpufreq_driver->get(policy->cpu));
> else
> ret = sysfs_emit(buf, "%u\n", policy->cur);
> return ret;
> }
>
> But that also makes show_cpuinfo_avg_freq() pretty much the same as
> show_scaling_cur_freq().
>
> So is it possible to consolidate show_cpuinfo_avg_freq() into
> show_scaling_cur_freq(), by making CONFIG_CPUFREQ_ARCH_CUR_FREQ also
> available to ARM64?
>
The whole idea of this patch was to add support for retrieving the average
frequency withouth modifying expected bahaviour of the scaling_cur_freq handler.
The proposed config option is there to keep backward compatibility and allow
userspace tools to catch up with the change - I'd expect that option to be
either set as default to No or droped at some point.
---
BR
Beata
> Kai-Heng
>
> > +
> > + if (avg_freq > 0)
> > + return sysfs_emit(buf, "%u\n", avg_freq);
> > + return avg_freq != 0 ? avg_freq : -EINVAL;
> > +}
> > +
> > /*
> > * show_scaling_governor - show the current policy for the specified CPU
> > */
> > @@ -964,6 +985,7 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
> > }
> > cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
> > +cpufreq_freq_attr_ro(cpuinfo_avg_freq);
> > cpufreq_freq_attr_ro(cpuinfo_min_freq);
> > cpufreq_freq_attr_ro(cpuinfo_max_freq);
> > cpufreq_freq_attr_ro(cpuinfo_transition_latency);
> > @@ -1091,6 +1113,12 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
> > return ret;
> > }
> > + if (cpufreq_avg_freq_supported(policy)) {
> > + ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
> > if (ret)
> > return ret;
> > diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> > index d4d2f4d1d7cb..a7b6c0ccf9bc 100644
> > --- a/include/linux/cpufreq.h
> > +++ b/include/linux/cpufreq.h
> > @@ -1194,7 +1194,7 @@ static inline int of_perf_domain_get_sharing_cpumask(int pcpu, const char *list_
> > }
> > #endif
> > -extern unsigned int arch_freq_get_on_cpu(int cpu);
> > +extern int arch_freq_get_on_cpu(int cpu);
> > #ifndef arch_set_freq_scale
> > static __always_inline
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-16 8:33 ` Kai-Heng Feng
@ 2024-12-16 22:32 ` Beata Michalska
0 siblings, 0 replies; 13+ messages in thread
From: Beata Michalska @ 2024-12-16 22:32 UTC (permalink / raw)
To: Kai-Heng Feng
Cc: Sumit Gupta, linux-kernel, linux-arm-kernel, linux-pm,
ionela.voinescu, sudeep.holla, will, catalin.marinas, rafael,
viresh.kumar, yang, vanshikonda, lihuisong, zhanjie9,
Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H . Peter Anvin, Phil Auld, x86, linux-doc,
Carol Soto, linux-tegra
On Mon, Dec 16, 2024 at 04:33:28PM +0800, Kai-Heng Feng wrote:
> Hi Sumit,
>
> On 2024/12/16 3:11 PM, Sumit Gupta wrote:
> >
> >
> > On 16/12/24 11:13, Kai-Heng Feng wrote:
> > > Hi Beata,
> > >
> > > On 2024/12/6 9:55 PM, Beata Michalska wrote:
> > > > Currently the CPUFreq core exposes two sysfs attributes that can be used
> > > > to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
> > > > and scaling_cur_freq. Both provide slightly different view on the
> > > > subject and they do come with their own drawbacks.
> > > >
> > > > cpuinfo_cur_freq provides higher precision though at a cost of being
> > > > rather expensive. Moreover, the information retrieved via this attribute
> > > > is somewhat short lived as frequency can change at any point of time
> > > > making it difficult to reason from.
> > > >
> > > > scaling_cur_freq, on the other hand, tends to be less accurate but then
> > > > the actual level of precision (and source of information) varies between
> > > > architectures making it a bit ambiguous.
> > > >
> > > > The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
> > > > distinct interface, exposing an average frequency of a given CPU(s), as
> > > > reported by the hardware, over a time frame spanning no more than a few
> > > > milliseconds. As it requires appropriate hardware support, this
> > > > interface is optional.
> > > >
> > > > Note that under the hood, the new attribute relies on the information
> > > > provided by arch_freq_get_on_cpu, which, up to this point, has been
> > > > feeding data for scaling_cur_freq attribute, being the source of
> > > > ambiguity when it comes to interpretation. This has been amended by
> > > > restoring the intended behavior for scaling_cur_freq, with a new
> > > > dedicated config option to maintain status quo for those, who may need
> > > > it.
> > > >
> > > > CC: Jonathan Corbet <corbet@lwn.net>
> > > > CC: Thomas Gleixner <tglx@linutronix.de>
> > > > CC: Ingo Molnar <mingo@redhat.com>
> > > > CC: Borislav Petkov <bp@alien8.de>
> > > > CC: Dave Hansen <dave.hansen@linux.intel.com>
> > > > CC: H. Peter Anvin <hpa@zytor.com>
> > > > CC: Phil Auld <pauld@redhat.com>
> > > > CC: x86@kernel.org
> > > > CC: linux-doc@vger.kernel.org
> > > > Signed-off-by: Beata Michalska <beata.michalska@arm.com>
> > > > ---
> > > > Documentation/admin-guide/pm/cpufreq.rst | 16 ++++++++++-
> > > > arch/x86/kernel/cpu/aperfmperf.c | 2 +-
> > > > arch/x86/kernel/cpu/proc.c | 7 +++--
> > > > drivers/cpufreq/Kconfig.x86 | 12 ++++++++
> > > > drivers/cpufreq/cpufreq.c | 36 +++++++++++++++++++++---
> > > > include/linux/cpufreq.h | 2 +-
> > > > 6 files changed, 66 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/Documentation/admin-guide/pm/cpufreq.rst
> > > > b/Documentation/admin- guide/pm/cpufreq.rst
> > > > index fe1be4ad88cb..76f3835afe01 100644
> > > > --- a/Documentation/admin-guide/pm/cpufreq.rst
> > > > +++ b/Documentation/admin-guide/pm/cpufreq.rst
> > > > @@ -248,6 +248,19 @@ are the following:
> > > > If that frequency cannot be determined, this attribute should not
> > > > be present.
> > > > +``cpuinfo_avg_freq``
> > > > + An average frequency (in KHz) of all CPUs belonging to a given policy,
> > > > + derived from a hardware provided feedback and reported on a time frame
> > > > + spanning at most few milliseconds.
> > > > +
> > > > + This is expected to be based on the frequency the
> > > > hardware actually runs
> > > > + at and, as such, might require specialised hardware
> > > > support (such as AMU
> > > > + extension on ARM). If one cannot be determined, this attribute should
> > > > + not be present.
> > > > +
> > > > + Note, that failed attempt to retrieve current frequency for a given
> > > > + CPU(s) will result in an appropriate error.
> > > > +
> > > > ``cpuinfo_max_freq``
> > > > Maximum possible operating frequency the CPUs belonging to this policy
> > > > can run at (in kHz).
> > > > @@ -293,7 +306,8 @@ are the following:
> > > > Some architectures (e.g. ``x86``) may attempt to provide information
> > > > more precisely reflecting the current CPU frequency through this
> > > > attribute, but that still may not be the exact current CPU frequency as
> > > > - seen by the hardware at the moment.
> > > > + seen by the hardware at the moment. This behavior though, is only
> > > > + available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option.
> > > > ``scaling_driver``
> > > > The scaling driver currently in use.
> > > > diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
> > > > index 0b69bfbf345d..a00059139ca4 100644
> > > > --- a/arch/x86/kernel/cpu/aperfmperf.c
> > > > +++ b/arch/x86/kernel/cpu/aperfmperf.c
> > > > @@ -413,7 +413,7 @@ void arch_scale_freq_tick(void)
> > > > */
> > > > #define MAX_SAMPLE_AGE ((unsigned long)HZ / 50)
> > > > -unsigned int arch_freq_get_on_cpu(int cpu)
> > > > +int arch_freq_get_on_cpu(int cpu)
> > > > {
> > > > struct aperfmperf *s = per_cpu_ptr(&cpu_samples, cpu);
> > > > unsigned int seq, freq;
> > > > diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
> > > > index e65fae63660e..34d8fb93fb70 100644
> > > > --- a/arch/x86/kernel/cpu/proc.c
> > > > +++ b/arch/x86/kernel/cpu/proc.c
> > > > @@ -86,9 +86,12 @@ static int show_cpuinfo(struct seq_file *m, void *v)
> > > > seq_printf(m, "microcode\t: 0x%x\n", c->microcode);
> > > > if (cpu_has(c, X86_FEATURE_TSC)) {
> > > > - unsigned int freq = arch_freq_get_on_cpu(cpu);
> > > > + int freq = arch_freq_get_on_cpu(cpu);
> > > > - seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq / 1000, (freq % 1000));
> > > > + if (freq <= 0)
> > > > + seq_puts(m, "cpu MHz\t\t: Unknown\n");
> > > > + else
> > > > + seq_printf(m, "cpu MHz\t\t: %u.%03u\n", freq /
> > > > 1000, (freq % 1000));
> > > > }
> > > > /* Cache size */
> > > > diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
> > > > index 97c2d4f15d76..212e1b9afe21 100644
> > > > --- a/drivers/cpufreq/Kconfig.x86
> > > > +++ b/drivers/cpufreq/Kconfig.x86
> > > > @@ -340,3 +340,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK
> > > > option lets the probing code bypass some of those checks if the
> > > > parameter "relaxed_check=1" is passed to the module.
> > > > +config CPUFREQ_ARCH_CUR_FREQ
> > > > + default y
> > > > + bool "Current frequency derived from HW provided feedback"
> > > > + help
> > > > + This determines whether the scaling_cur_freq sysfs attribute returns
> > > > + the last requested frequency or a more precise value based on hardware
> > > > + provided feedback (as architected counters).
> > > > + Given that a more precise frequency can now be provided via the
> > > > + cpuinfo_avg_cur_freq attribute, by enabling this option,
> > > > + scaling_cur_freq maintains the provision of a counter based frequency,
> > > > + for compatibility reasons.
> > > > +
> > > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > > > index 04fc786dd2c0..70df2a24437b 100644
> > > > --- a/drivers/cpufreq/cpufreq.c
> > > > +++ b/drivers/cpufreq/cpufreq.c
> > > > @@ -747,9 +747,14 @@ show_one(cpuinfo_transition_latency,
> > > > cpuinfo.transition_latency);
> > > > show_one(scaling_min_freq, min);
> > > > show_one(scaling_max_freq, max);
> > > > -__weak unsigned int arch_freq_get_on_cpu(int cpu)
> > > > +__weak int arch_freq_get_on_cpu(int cpu)
> > > > {
> > > > - return 0;
> > > > + return -EOPNOTSUPP;
> > > > +}
> > > > +
> > > > +static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy *policy)
> > > > +{
> > > > + return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP;
> > > > }
> > > > static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
> > > > @@ -757,8 +762,11 @@ static ssize_t show_scaling_cur_freq(struct
> > > > cpufreq_policy *policy, char *buf)
> > > > ssize_t ret;
> > > > unsigned int freq;
> > > > - freq = arch_freq_get_on_cpu(policy->cpu);
> > > > - if (freq)
> > > > + freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)
> > > > + ? arch_freq_get_on_cpu(policy->cpu)
> > > > + : 0;
> > > > +
> > > > + if (freq > 0)
> > > > ret = sysfs_emit(buf, "%u\n", freq);
> > > > else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
> > > > ret = sysfs_emit(buf, "%u\n", cpufreq_driver->get(policy->cpu));
> > > > @@ -802,6 +810,19 @@ static ssize_t show_cpuinfo_cur_freq(struct
> > > > cpufreq_policy *policy,
> > > > return sysfs_emit(buf, "<unknown>\n");
> > > > }
> > > > +/*
> > > > + * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware
> > > > + */
> > > > +static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
> > > > + char *buf)
> > > > +{
> > > > + int avg_freq = arch_freq_get_on_cpu(policy->cpu);
> > >
> > > We are seeing issues when reading cpuinfo_avg_freq on an ARM64 system:
> > >
> > > $ cat /sys/devices/system/cpu/cpufreq/policy1/cpuinfo_avg_freq
> > > cat: /sys/devices/system/cpu/cpufreq/policy1/cpuinfo_avg_freq:
> > > Resource temporarily unavailable
> > >
> > > The CPU is in idle state, so arch_freq_get_on_cpu() can't find a
> > > good alternative source for frequency info.
> > >
> >
> > Hi Kai Heng,
> > This has already been discussed during v7 in [1] & [2].
>
> Thanks for the info!
@Sumit: Thank you indeed.
>
> > In v7, we were returning zero which printed 'unknown'.
> > The discussion was about printing in more descriptive way or with an
> > appropriate error code. In v8 we are returning 'EAGAIN' instead of zero.
> > The final decision was of Maintainers.
>
> Is there any cpufreq driver that prints "unknown" or error when CPU is in idle?
I think at this point, for cpuinfo_cur_freq, one gets either the frequency value
or 'unknown'. I'm not sure whether there are any drivers that report error upon
'get'.
>
> I think it's more unsurprising to print the lowest CPU frequency when CPU is
> in idle state, instead of any other error code.
With that approach one cannot easily determine whether the CPU is actually
running at the lowest frequency or it's idle.
Returning an error also avoids mixing types of data provided by the handler.
---
BR
Beata
>
> Kai-Heng
>
> >
> > Viresh,
> > You have any preference on this?
> >
> > [1] https://lore.kernel.org/lkml/aa254516-968e-4665-bb5b-981c296ffc35@nvidia.com/#t
> > [2] https://lore.kernel.org/lkml/Zyh-uVSW-0d0r8oB@arm.com/
> >
> > Thank you,
> > Sumit Gupta
> >
> > > One way to resolve this is to have fallback methods in
> > > show_cpuinfo_avg_freq() so it will look like this:
> > >
> > > static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
> > > char *buf)
> > > {
> > > int avg_freq = arch_freq_get_on_cpu(policy->cpu);
> > > int ret;
> > >
> > > if (avg_freq > 0)
> > > ret = sysfs_emit(buf, "%u\n", avg_freq);
> > > else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
> > > ret = sysfs_emit(buf, "%u\n", cpufreq_driver->get(policy->cpu));
> > > else
> > > ret = sysfs_emit(buf, "%u\n", policy->cur);
> > > return ret;
> > > }
> > >
> > > But that also makes show_cpuinfo_avg_freq() pretty much the same as
> > > show_scaling_cur_freq().
> > >
> > > So is it possible to consolidate show_cpuinfo_avg_freq() into
> > > show_scaling_cur_freq(), by making CONFIG_CPUFREQ_ARCH_CUR_FREQ also
> > > available to ARM64?
> > >
> > > Kai-Heng
> > >
> > > > +
> > > > + if (avg_freq > 0)
> > > > + return sysfs_emit(buf, "%u\n", avg_freq);
> > > > + return avg_freq != 0 ? avg_freq : -EINVAL;
> > > > +}
> > > > +
> > > > /*
> > > > * show_scaling_governor - show the current policy for the specified CPU
> > > > */
> > > > @@ -964,6 +985,7 @@ static ssize_t show_bios_limit(struct
> > > > cpufreq_policy *policy, char *buf)
> > > > }
> > > > cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
> > > > +cpufreq_freq_attr_ro(cpuinfo_avg_freq);
> > > > cpufreq_freq_attr_ro(cpuinfo_min_freq);
> > > > cpufreq_freq_attr_ro(cpuinfo_max_freq);
> > > > cpufreq_freq_attr_ro(cpuinfo_transition_latency);
> > > > @@ -1091,6 +1113,12 @@ static int
> > > > cpufreq_add_dev_interface(struct cpufreq_policy *policy)
> > > > return ret;
> > > > }
> > > > + if (cpufreq_avg_freq_supported(policy)) {
> > > > + ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr);
> > > > + if (ret)
> > > > + return ret;
> > > > + }
> > > > +
> > > > ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
> > > > if (ret)
> > > > return ret;
> > > > diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> > > > index d4d2f4d1d7cb..a7b6c0ccf9bc 100644
> > > > --- a/include/linux/cpufreq.h
> > > > +++ b/include/linux/cpufreq.h
> > > > @@ -1194,7 +1194,7 @@ static inline int
> > > > of_perf_domain_get_sharing_cpumask(int pcpu, const char *list_
> > > > }
> > > > #endif
> > > > -extern unsigned int arch_freq_get_on_cpu(int cpu);
> > > > +extern int arch_freq_get_on_cpu(int cpu);
> > > > #ifndef arch_set_freq_scale
> > > > static __always_inline
> > >
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-16 22:15 ` Beata Michalska
@ 2024-12-17 4:27 ` Viresh Kumar
2024-12-17 20:10 ` Beata Michalska
0 siblings, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2024-12-17 4:27 UTC (permalink / raw)
To: Beata Michalska
Cc: linux-kernel, linux-arm-kernel, linux-pm, ionela.voinescu,
sudeep.holla, will, catalin.marinas, rafael, sumitg, yang,
vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc
On 16-12-24, 23:15, Beata Michalska wrote:
> My bad as I must have misinterpreted that message. Although I am not entirely
> sure why this might be unacceptable as it is not such uncommon approach to use
> signed int space to cover both: expected positive value as well as potential
> error code case failure.
This part is fine. The problem is with handling frequency here. Signed int can
capture up to 2 GHz of freq, where as unsigned int can capture up to 4 GHz and
so we would really like to keep it at 4 GHz..
Maybe we need to move to 64 bits for frequency at some point of time, but at
least we should try to not break it for now.
> Enabling the new attribute for all is an option, tough not entirely compelling
> one as exposing a feature that is known not to be supported seems bit
> counterintuitive. On the other hand using cpufreq driver flags won't help much
> as the support for the new attrib is platform-specific, not driver-specific.
--
viresh
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-17 4:27 ` Viresh Kumar
@ 2024-12-17 20:10 ` Beata Michalska
2024-12-18 4:11 ` Viresh Kumar
0 siblings, 1 reply; 13+ messages in thread
From: Beata Michalska @ 2024-12-17 20:10 UTC (permalink / raw)
To: Viresh Kumar
Cc: linux-kernel, linux-arm-kernel, linux-pm, ionela.voinescu,
sudeep.holla, will, catalin.marinas, rafael, sumitg, yang,
vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc
On Tue, Dec 17, 2024 at 09:57:26AM +0530, Viresh Kumar wrote:
> On 16-12-24, 23:15, Beata Michalska wrote:
> > My bad as I must have misinterpreted that message. Although I am not entirely
> > sure why this might be unacceptable as it is not such uncommon approach to use
> > signed int space to cover both: expected positive value as well as potential
> > error code case failure.
>
> This part is fine. The problem is with handling frequency here. Signed int can
> capture up to 2 GHz of freq, where as unsigned int can capture up to 4 GHz and
> so we would really like to keep it at 4 GHz..
Right, though the arch_freq_get_on_cpu operates on kHz values.
---
BR
Beata
>
> Maybe we need to move to 64 bits for frequency at some point of time, but at
> least we should try to not break it for now.
>
> > Enabling the new attribute for all is an option, tough not entirely compelling
> > one as exposing a feature that is known not to be supported seems bit
> > counterintuitive. On the other hand using cpufreq driver flags won't help much
> > as the support for the new attrib is platform-specific, not driver-specific.
>
> --
> viresh
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-17 20:10 ` Beata Michalska
@ 2024-12-18 4:11 ` Viresh Kumar
2024-12-19 11:57 ` Beata Michalska
0 siblings, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2024-12-18 4:11 UTC (permalink / raw)
To: Beata Michalska
Cc: linux-kernel, linux-arm-kernel, linux-pm, ionela.voinescu,
sudeep.holla, will, catalin.marinas, rafael, sumitg, yang,
vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc
On 17-12-24, 21:10, Beata Michalska wrote:
> On Tue, Dec 17, 2024 at 09:57:26AM +0530, Viresh Kumar wrote:
> > On 16-12-24, 23:15, Beata Michalska wrote:
> > > My bad as I must have misinterpreted that message. Although I am not entirely
> > > sure why this might be unacceptable as it is not such uncommon approach to use
> > > signed int space to cover both: expected positive value as well as potential
> > > error code case failure.
> >
> > This part is fine. The problem is with handling frequency here. Signed int can
> > capture up to 2 GHz of freq, where as unsigned int can capture up to 4 GHz and
> > so we would really like to keep it at 4 GHz..
> Right, though the arch_freq_get_on_cpu operates on kHz values.
Hmm.. Missed that.
If you still want to keep it, make that change in a separate patch and
the new sysfs entry in a different one, so related people can easily
review.
--
viresh
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
2024-12-18 4:11 ` Viresh Kumar
@ 2024-12-19 11:57 ` Beata Michalska
0 siblings, 0 replies; 13+ messages in thread
From: Beata Michalska @ 2024-12-19 11:57 UTC (permalink / raw)
To: Viresh Kumar
Cc: linux-kernel, linux-arm-kernel, linux-pm, ionela.voinescu,
sudeep.holla, will, catalin.marinas, rafael, sumitg, yang,
vanshikonda, lihuisong, zhanjie9, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H . Peter Anvin, Phil Auld, x86, linux-doc
On Wed, Dec 18, 2024 at 09:41:27AM +0530, Viresh Kumar wrote:
> On 17-12-24, 21:10, Beata Michalska wrote:
> > On Tue, Dec 17, 2024 at 09:57:26AM +0530, Viresh Kumar wrote:
> > > On 16-12-24, 23:15, Beata Michalska wrote:
> > > > My bad as I must have misinterpreted that message. Although I am not entirely
> > > > sure why this might be unacceptable as it is not such uncommon approach to use
> > > > signed int space to cover both: expected positive value as well as potential
> > > > error code case failure.
> > >
> > > This part is fine. The problem is with handling frequency here. Signed int can
> > > capture up to 2 GHz of freq, where as unsigned int can capture up to 4 GHz and
> > > so we would really like to keep it at 4 GHz..
> > Right, though the arch_freq_get_on_cpu operates on kHz values.
>
> Hmm.. Missed that.
>
> If you still want to keep it, make that change in a separate patch and
> the new sysfs entry in a different one, so related people can easily
> review.
>
Will do.
Thank you for your feedback. Much appreciated.
---
BR
Beata
> --
> viresh
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-12-19 11:58 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-06 13:55 [PATCH v8 0/4] Add support for AArch64 AMUv1-based average freq Beata Michalska
2024-12-06 13:55 ` [PATCH v8 1/4] cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry Beata Michalska
2024-12-12 6:51 ` Viresh Kumar
2024-12-16 22:15 ` Beata Michalska
2024-12-17 4:27 ` Viresh Kumar
2024-12-17 20:10 ` Beata Michalska
2024-12-18 4:11 ` Viresh Kumar
2024-12-19 11:57 ` Beata Michalska
2024-12-16 5:43 ` Kai-Heng Feng
2024-12-16 7:11 ` Sumit Gupta
2024-12-16 8:33 ` Kai-Heng Feng
2024-12-16 22:32 ` Beata Michalska
2024-12-16 22:21 ` Beata Michalska
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).