* [PATCH v1 1/3] cpufreq: CPPC: Add cppc_cpufreq_search_cpu_data
2022-03-17 13:34 [PATCH v1 0/3] Enable EAS for CPPC/ACPI based systems Pierre Gondois
@ 2022-03-17 13:34 ` Pierre Gondois
2022-03-17 14:20 ` Marc Zyngier
2022-03-17 13:34 ` [PATCH v1 2/3] cpufreq: CPPC: Add per_cpu efficiency_class Pierre Gondois
2022-03-17 13:34 ` [PATCH v1 3/3] cpufreq: CPPC: Register EM based on efficiency class information Pierre Gondois
2 siblings, 1 reply; 10+ messages in thread
From: Pierre Gondois @ 2022-03-17 13:34 UTC (permalink / raw)
To: linux-kernel
Cc: Ionela.Voinescu, Lukasz.Luba, Morten.Rasmussen, Dietmar.Eggemann,
mka, daniel.lezcano, Pierre Gondois, Catalin Marinas, Will Deacon,
Rafael J. Wysocki, Viresh Kumar, Mark Rutland, Ard Biesheuvel,
Fuad Tabba, Marc Zyngier, Valentin Schneider, Rob Herring,
linux-arm-kernel, linux-pm
cppc_cpufreq_get_cpu_data() allocates a new struct cppc_cpudata
for the input CPU at each call.
To search the struct associated with a cpu without allocating
a new one, add cppc_cpufreq_search_cpu_data().
Also add an early prototype.
This will be used in a later patch, when generating artificial
performance states to register an artificial Energy Model in the
cppc_cpufreq driver and enable the Energy Aware Scheduler for ACPI
based systems.
Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
---
drivers/cpufreq/cppc_cpufreq.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 82d370ae6a4a..8f950fe72765 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -41,6 +41,8 @@
*/
static LIST_HEAD(cpu_data_list);
+static struct cppc_cpudata *cppc_cpufreq_search_cpu_data(unsigned int cpu);
+
static bool boost_supported;
struct cppc_workaround_oem_info {
@@ -479,6 +481,19 @@ static void cppc_cpufreq_put_cpu_data(struct cpufreq_policy *policy)
policy->driver_data = NULL;
}
+static inline struct cppc_cpudata *
+cppc_cpufreq_search_cpu_data(unsigned int cpu)
+{
+ struct cppc_cpudata *iter, *tmp;
+
+ list_for_each_entry_safe(iter, tmp, &cpu_data_list, node) {
+ if (cpumask_test_cpu(cpu, iter->shared_cpu_map))
+ return iter;
+ }
+
+ return NULL;
+}
+
static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
{
unsigned int cpu = policy->cpu;
--
2.25.1
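
For readers outside the kernel tree, the lookup added by this patch can be sketched in user space. This is only an illustration under stated assumptions: a singly linked list and a 64-bit mask stand in for the kernel's list_head and cpumask, and the names cpudata_sketch and search_cpu_data are hypothetical. Only the idea of finding the existing entry without allocating a new one comes from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cppc_cpudata: one node per policy. */
struct cpudata_sketch {
	uint64_t shared_cpu_map;	/* bit n set => CPU n shares this data */
	struct cpudata_sketch *next;	/* stand-in for the list_head node */
};

/* Return the node whose mask contains cpu, or NULL. Never allocates. */
static struct cpudata_sketch *
search_cpu_data(struct cpudata_sketch *head, unsigned int cpu)
{
	struct cpudata_sketch *iter;

	/* The 64-bit mask of this sketch cannot describe CPU 64 and above. */
	if (cpu >= 64)
		return NULL;

	for (iter = head; iter; iter = iter->next) {
		if (iter->shared_cpu_map & (UINT64_C(1) << cpu))
			return iter;
	}

	return NULL;
}
```

A NULL result means no entry was created yet for that CPU, so a caller would have to handle that case explicitly.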
* Re: [PATCH v1 1/3] cpufreq: CPPC: Add cppc_cpufreq_search_cpu_data
2022-03-17 13:34 ` [PATCH v1 1/3] cpufreq: CPPC: Add cppc_cpufreq_search_cpu_data Pierre Gondois
@ 2022-03-17 14:20 ` Marc Zyngier
2022-03-17 14:44 ` Pierre Gondois
0 siblings, 1 reply; 10+ messages in thread
From: Marc Zyngier @ 2022-03-17 14:20 UTC (permalink / raw)
To: Pierre Gondois
Cc: linux-kernel, Ionela.Voinescu, Lukasz.Luba, Morten.Rasmussen,
Dietmar.Eggemann, mka, daniel.lezcano, Catalin Marinas,
Will Deacon, Rafael J. Wysocki, Viresh Kumar, Mark Rutland,
Ard Biesheuvel, Fuad Tabba, Valentin Schneider, Rob Herring,
linux-arm-kernel, linux-pm
On 2022-03-17 13:34, Pierre Gondois wrote:
> cppc_cpufreq_get_cpu_data() allocates a new struct cppc_cpudata
> for the input CPU at each call.
>
> To search the struct associated with a cpu without allocating
> a new one, add cppc_cpufreq_search_cpu_data().
> Also add an early prototype.
>
> This will be used in a later patch, when generating artificial
> performance states to register an artificial Energy Model in the
> cppc_cpufreq driver and enable the Energy Aware Scheduler for ACPI
> based systems.
>
> Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
> ---
> drivers/cpufreq/cppc_cpufreq.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/cpufreq/cppc_cpufreq.c
> b/drivers/cpufreq/cppc_cpufreq.c
> index 82d370ae6a4a..8f950fe72765 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -41,6 +41,8 @@
> */
> static LIST_HEAD(cpu_data_list);
>
> +static struct cppc_cpudata *cppc_cpufreq_search_cpu_data(unsigned int
> cpu);
> +
> static bool boost_supported;
>
> struct cppc_workaround_oem_info {
> @@ -479,6 +481,19 @@ static void cppc_cpufreq_put_cpu_data(struct
> cpufreq_policy *policy)
> policy->driver_data = NULL;
> }
>
> +static inline struct cppc_cpudata *
Why the inline? This is hardly performance critical, and if
it is, you want something better than iterating over a list.
> +cppc_cpufreq_search_cpu_data(unsigned int cpu)
> +{
> + struct cppc_cpudata *iter, *tmp;
> +
> + list_for_each_entry_safe(iter, tmp, &cpu_data_list, node) {
> + if (cpumask_test_cpu(cpu, iter->shared_cpu_map))
> + return iter;
> + }
> +
> + return NULL;
> +}
> +
> static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> {
> unsigned int cpu = policy->cpu;
Thanks,
M.
--
Jazz is not dead. It just smells funny...
* Re: [PATCH v1 1/3] cpufreq: CPPC: Add cppc_cpufreq_search_cpu_data
2022-03-17 14:20 ` Marc Zyngier
@ 2022-03-17 14:44 ` Pierre Gondois
2022-03-17 15:17 ` Marc Zyngier
0 siblings, 1 reply; 10+ messages in thread
From: Pierre Gondois @ 2022-03-17 14:44 UTC (permalink / raw)
To: Marc Zyngier
Cc: linux-kernel, Ionela.Voinescu, Lukasz.Luba, Morten.Rasmussen,
Dietmar.Eggemann, mka, daniel.lezcano, Catalin Marinas,
Will Deacon, Rafael J. Wysocki, Viresh Kumar, Mark Rutland,
Ard Biesheuvel, Fuad Tabba, Rob Herring, linux-arm-kernel,
linux-pm
On 3/17/22 15:20, Marc Zyngier wrote:
> On 2022-03-17 13:34, Pierre Gondois wrote:
>> cppc_cpufreq_get_cpu_data() allocates a new struct cppc_cpudata
>> for the input CPU at each call.
>>
>> To search the struct associated with a cpu without allocating
>> a new one, add cppc_cpufreq_search_cpu_data().
>> Also add an early prototype.
>>
>> This will be used in a later patch, when generating artificial
>> performance states to register an artificial Energy Model in the
>> cppc_cpufreq driver and enable the Energy Aware Scheduler for ACPI
>> based systems.
>>
>> Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
>> ---
>> drivers/cpufreq/cppc_cpufreq.c | 15 +++++++++++++++
>> 1 file changed, 15 insertions(+)
>>
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c
>> b/drivers/cpufreq/cppc_cpufreq.c
>> index 82d370ae6a4a..8f950fe72765 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -41,6 +41,8 @@
>> */
>> static LIST_HEAD(cpu_data_list);
>>
>> +static struct cppc_cpudata *cppc_cpufreq_search_cpu_data(unsigned int
>> cpu);
>> +
>> static bool boost_supported;
>>
>> struct cppc_workaround_oem_info {
>> @@ -479,6 +481,19 @@ static void cppc_cpufreq_put_cpu_data(struct
>> cpufreq_policy *policy)
>> policy->driver_data = NULL;
>> }
>>
>> +static inline struct cppc_cpudata *
>
> Why the inline? This is hardly performance critical, and if
> it is, you want something better than iterating over a list.
This was made inline mainly because the function was small. The function
is called only at boot, so it should not be performance critical. The
'inline' can be removed if necessary.
Would leaving it inline have a negative impact?
>
>> +cppc_cpufreq_search_cpu_data(unsigned int cpu)
>> +{
>> + struct cppc_cpudata *iter, *tmp;
>> +
>> + list_for_each_entry_safe(iter, tmp, &cpu_data_list, node) {
>> + if (cpumask_test_cpu(cpu, iter->shared_cpu_map))
>> + return iter;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>> {
>> unsigned int cpu = policy->cpu;
>
> Thanks,
>
> M.
Regards,
Pierre
* Re: [PATCH v1 1/3] cpufreq: CPPC: Add cppc_cpufreq_search_cpu_data
2022-03-17 14:44 ` Pierre Gondois
@ 2022-03-17 15:17 ` Marc Zyngier
0 siblings, 0 replies; 10+ messages in thread
From: Marc Zyngier @ 2022-03-17 15:17 UTC (permalink / raw)
To: Pierre Gondois
Cc: linux-kernel, Ionela.Voinescu, Lukasz.Luba, Morten.Rasmussen,
Dietmar.Eggemann, mka, daniel.lezcano, Catalin Marinas,
Will Deacon, Rafael J. Wysocki, Viresh Kumar, Mark Rutland,
Ard Biesheuvel, Fuad Tabba, Rob Herring, linux-arm-kernel,
linux-pm
On 2022-03-17 14:44, Pierre Gondois wrote:
> On 3/17/22 15:20, Marc Zyngier wrote:
>> On 2022-03-17 13:34, Pierre Gondois wrote:
>>> cppc_cpufreq_get_cpu_data() allocates a new struct cppc_cpudata
>>> for the input CPU at each call.
>>>
>>> To search the struct associated with a cpu without allocating
>>> a new one, add cppc_cpufreq_search_cpu_data().
>>> Also add an early prototype.
>>>
>>> This will be used in a later patch, when generating artificial
>>> performance states to register an artificial Energy Model in the
>>> cppc_cpufreq driver and enable the Energy Aware Scheduler for ACPI
>>> based systems.
>>>
>>> Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
>>> ---
>>> drivers/cpufreq/cppc_cpufreq.c | 15 +++++++++++++++
>>> 1 file changed, 15 insertions(+)
>>>
>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c
>>> b/drivers/cpufreq/cppc_cpufreq.c
>>> index 82d370ae6a4a..8f950fe72765 100644
>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>> @@ -41,6 +41,8 @@
>>> */
>>> static LIST_HEAD(cpu_data_list);
>>>
>>> +static struct cppc_cpudata *cppc_cpufreq_search_cpu_data(unsigned
>>> int
>>> cpu);
>>> +
>>> static bool boost_supported;
>>>
>>> struct cppc_workaround_oem_info {
>>> @@ -479,6 +481,19 @@ static void cppc_cpufreq_put_cpu_data(struct
>>> cpufreq_policy *policy)
>>> policy->driver_data = NULL;
>>> }
>>>
>>> +static inline struct cppc_cpudata *
>>
>> Why the inline? This is hardly performance critical, and if
>> it is, you want something better than iterating over a list.
>
> This was made inline mainly because the function was small. The
> function is called only at boot, so it should not be performance
> critical. The 'inline' can be removed if necessary.
> Would leaving it inline have a negative impact?
This is why we have a compiler. It is perfectly able to decide
on its own whether to inline or not, depending on how it can
optimise it. With modern compilers, 'inline' means nothing anyway,
and is ignored most of the time.
So dropping it will at least save 7 bytes of source code! ;-)
M.
--
Jazz is not dead. It just smells funny...
* [PATCH v1 2/3] cpufreq: CPPC: Add per_cpu efficiency_class
2022-03-17 13:34 [PATCH v1 0/3] Enable EAS for CPPC/ACPI based systems Pierre Gondois
2022-03-17 13:34 ` [PATCH v1 1/3] cpufreq: CPPC: Add cppc_cpufreq_search_cpu_data Pierre Gondois
@ 2022-03-17 13:34 ` Pierre Gondois
2022-03-17 15:13 ` Marc Zyngier
2022-03-17 13:34 ` [PATCH v1 3/3] cpufreq: CPPC: Register EM based on efficiency class information Pierre Gondois
2 siblings, 1 reply; 10+ messages in thread
From: Pierre Gondois @ 2022-03-17 13:34 UTC (permalink / raw)
To: linux-kernel
Cc: Ionela.Voinescu, Lukasz.Luba, Morten.Rasmussen, Dietmar.Eggemann,
mka, daniel.lezcano, Pierre Gondois, Catalin Marinas, Will Deacon,
Rafael J. Wysocki, Viresh Kumar, Mark Rutland, Ard Biesheuvel,
Fuad Tabba, Marc Zyngier, Valentin Schneider, Rob Herring,
linux-arm-kernel, linux-pm
In ACPI, describing power efficiency of CPUs can be done through the
following arm specific field:
ACPI 6.4, s5.2.12.14 'GIC CPU Interface (GICC) Structure',
'Processor Power Efficiency Class field':
Describes the relative power efficiency of the associated
processor. Lower efficiency class numbers are more efficient than
higher ones (e.g. efficiency class 0 should be treated as more
efficient than efficiency class 1). However, absolute values
of this number have no meaning: 2 isn’t necessarily half as
efficient as 1.
The efficiency_class field is stored in the GicC structure of the
ACPI MADT table and it's currently supported in Linux for arm64 only.
Thus, this new functionality is introduced for arm64 only.
To allow the cppc_cpufreq driver to know and preprocess the
efficiency_class values of all the CPUs, add a per_cpu efficiency_class
variable to store them. Also add a static efficiency_class_populated
to let the driver know efficiency_class values are usable and register
an artificial Energy Model (EM) based on normalized class values.
At least 2 different efficiency classes must be present,
otherwise there is no use in creating an Energy Model.
The efficiency_class values are squeezed in [0:#efficiency_class-1]
while conserving the order. For instance, efficiency classes of:
[111, 212, 250]
will be mapped to:
[0 (was 111), 1 (was 212), 2 (was 250)].
Since each policy is registered independently in the driver, the
per_cpu efficiency_class is populated only once, at driver
initialization. This avoids having each policy search again for the
efficiency_class values of other CPUs.
The patch also exports acpi_cpu_get_madt_gicc() to fetch the GicC
structure of the ACPI MADT table for each CPU.
Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
---
arch/arm64/kernel/smp.c | 1 +
drivers/cpufreq/cppc_cpufreq.c | 55 ++++++++++++++++++++++++++++++++++
2 files changed, 56 insertions(+)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 27df5c1e6baa..56637cbea5d6 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -512,6 +512,7 @@ struct acpi_madt_generic_interrupt *acpi_cpu_get_madt_gicc(int cpu)
{
return &cpu_madt_gicc[cpu];
}
+EXPORT_SYMBOL(acpi_cpu_get_madt_gicc);
/*
* acpi_map_gic_cpu_interface - parse processor MADT entry
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 8f950fe72765..a6cd95c3b474 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -422,12 +422,66 @@ static unsigned int cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
return cppc_get_transition_latency(cpu) / NSEC_PER_USEC;
}
+static bool efficiency_class_populated;
+static DEFINE_PER_CPU(unsigned int, efficiency_class);
+
+static int populate_efficiency_class(void)
+{
+ unsigned int min = UINT_MAX, max = 0, class;
+ struct acpi_madt_generic_interrupt *gicc;
+ int cpu;
+
+ for_each_possible_cpu(cpu) {
+ gicc = acpi_cpu_get_madt_gicc(cpu);
+ if (!gicc)
+ return -ENODEV;
+
+ per_cpu(efficiency_class, cpu) = gicc->efficiency_class;
+ min = min_t(unsigned int, min, gicc->efficiency_class);
+ max = max_t(unsigned int, max, gicc->efficiency_class);
+ }
+
+ if (min == max) {
+ pr_debug("Efficiency classes are all equal (=%d). "
+ "No EM registered", max);
+ return -EINVAL;
+ }
+
+ /*
+ * Squeeze efficiency class values on [0:#efficiency_class-1].
+ * Values are per spec in [0:255].
+ */
+ for (class = 0; class < 256; class++) {
+ unsigned int new_min, curr;
+
+ new_min = UINT_MAX;
+ for_each_possible_cpu(cpu) {
+ curr = per_cpu(efficiency_class, cpu);
+ if (curr == min)
+ per_cpu(efficiency_class, cpu) = class;
+ else if (curr > min)
+ new_min = min(new_min, curr);
+ }
+
+ if (new_min == UINT_MAX)
+ break;
+ min = new_min;
+ }
+
+ efficiency_class_populated = true;
+ return 0;
+}
+
#else
static unsigned int cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
{
return cppc_get_transition_latency(cpu) / NSEC_PER_USEC;
}
+static int populate_efficiency_class(void)
+{
+ return 0;
+}
#endif
@@ -757,6 +811,7 @@ static int __init cppc_cpufreq_init(void)
cppc_check_hisi_workaround();
cppc_freq_invariance_init();
+ populate_efficiency_class();
ret = cpufreq_register_driver(&cppc_cpufreq_driver);
if (ret)
--
2.25.1
* Re: [PATCH v1 2/3] cpufreq: CPPC: Add per_cpu efficiency_class
2022-03-17 13:34 ` [PATCH v1 2/3] cpufreq: CPPC: Add per_cpu efficiency_class Pierre Gondois
@ 2022-03-17 15:13 ` Marc Zyngier
2022-03-17 16:07 ` Pierre Gondois
0 siblings, 1 reply; 10+ messages in thread
From: Marc Zyngier @ 2022-03-17 15:13 UTC (permalink / raw)
To: Pierre Gondois
Cc: linux-kernel, Ionela.Voinescu, Lukasz.Luba, Morten.Rasmussen,
Dietmar.Eggemann, mka, daniel.lezcano, Catalin Marinas,
Will Deacon, Rafael J. Wysocki, Viresh Kumar, Mark Rutland,
Ard Biesheuvel, Fuad Tabba, Valentin Schneider, Rob Herring,
linux-arm-kernel, linux-pm
On 2022-03-17 13:34, Pierre Gondois wrote:
> In ACPI, describing power efficiency of CPUs can be done through the
> following arm specific field:
> ACPI 6.4, s5.2.12.14 'GIC CPU Interface (GICC) Structure',
> 'Processor Power Efficiency Class field':
> Describes the relative power efficiency of the associated
> processor. Lower efficiency class numbers are more efficient than
> higher ones (e.g. efficiency class 0 should be treated as more
> efficient than efficiency class 1). However, absolute values
> of this number have no meaning: 2 isn’t necessarily half as
> efficient as 1.
>
> The efficiency_class field is stored in the GicC structure of the
> ACPI MADT table and it's currently supported in Linux for arm64 only.
> Thus, this new functionality is introduced for arm64 only.
>
> To allow the cppc_cpufreq driver to know and preprocess the
> efficiency_class values of all the CPUs, add a per_cpu efficiency_class
> variable to store them. Also add a static efficiency_class_populated
> to let the driver know efficiency_class values are usable and register
> an artificial Energy Model (EM) based on normalized class values.
>
> At least 2 different efficiency classes must be present,
> otherwise there is no use in creating an Energy Model.
>
> The efficiency_class values are squeezed in [0:#efficiency_class-1]
> while conserving the order. For instance, efficiency classes of:
> [111, 212, 250]
> will be mapped to:
> [0 (was 111), 1 (was 212), 2 (was 250)].
>
> Each policy being independently registered in the driver, populating
> the per_cpu efficiency_class is done only once at the driver
> initialization. This prevents from having each policy re-searching the
> efficiency_class values of other CPUs.
>
> The patch also exports acpi_cpu_get_madt_gicc() to fetch the GicC
> structure of the ACPI MADT table for each CPU.
>
> Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
> ---
> arch/arm64/kernel/smp.c | 1 +
> drivers/cpufreq/cppc_cpufreq.c | 55 ++++++++++++++++++++++++++++++++++
> 2 files changed, 56 insertions(+)
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 27df5c1e6baa..56637cbea5d6 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -512,6 +512,7 @@ struct acpi_madt_generic_interrupt
> *acpi_cpu_get_madt_gicc(int cpu)
> {
> return &cpu_madt_gicc[cpu];
> }
> +EXPORT_SYMBOL(acpi_cpu_get_madt_gicc);
Why not EXPORT_SYMBOL_GPL()?
>
> /*
> * acpi_map_gic_cpu_interface - parse processor MADT entry
> diff --git a/drivers/cpufreq/cppc_cpufreq.c
> b/drivers/cpufreq/cppc_cpufreq.c
> index 8f950fe72765..a6cd95c3b474 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -422,12 +422,66 @@ static unsigned int
> cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
> return cppc_get_transition_latency(cpu) / NSEC_PER_USEC;
> }
>
> +static bool efficiency_class_populated;
> +static DEFINE_PER_CPU(unsigned int, efficiency_class);
> +
> +static int populate_efficiency_class(void)
> +{
> + unsigned int min = UINT_MAX, max = 0, class;
> + struct acpi_madt_generic_interrupt *gicc;
> + int cpu;
> +
> + for_each_possible_cpu(cpu) {
> + gicc = acpi_cpu_get_madt_gicc(cpu);
> + if (!gicc)
> + return -ENODEV;
How can that happen if you made it here using ACPI?
> +
> + per_cpu(efficiency_class, cpu) = gicc->efficiency_class;
> + min = min_t(unsigned int, min, gicc->efficiency_class);
> + max = max_t(unsigned int, max, gicc->efficiency_class);
> + }
Why don't you use a temporary bitmap of 256 bits, tracking
the classes that are actually being used?
> +
> + if (min == max) {
This would become (bitmap_weight(used_classes) <= 1). Then from
the same construct you know how many different classes you have.
You also have the min, max, and all the values in between.
> + pr_debug("Efficiency classes are all equal (=%d). "
> + "No EM registered", max);
> + return -EINVAL;
> + }
> +
> + /*
> + * Squeeze efficiency class values on [0:#efficiency_class-1].
> + * Values are per spec in [0:255].
> + */
> + for (class = 0; class < 256; class++) {
> + unsigned int new_min, curr;
> +
> + new_min = UINT_MAX;
> + for_each_possible_cpu(cpu) {
> + curr = per_cpu(efficiency_class, cpu);
> + if (curr == min)
> + per_cpu(efficiency_class, cpu) = class;
> + else if (curr > min)
> + new_min = min(new_min, curr);
> + }
> +
> + if (new_min == UINT_MAX)
> + break;
> + min = new_min;
> + }
I find it really hard to reason about this because you are
dynamically rewriting the values you keep reevaluating.
How about something like this, which I find more readable:
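	/* One bit per possible efficiency class value [0:255]. */
	DECLARE_BITMAP(used_classes, 256) = {};
	int class, index, cpu;

	/* Record every class value that is actually in use. */
	for_each_possible_cpu(cpu) {
		unsigned int ec;

		ec = acpi_cpu_get_madt_gicc(cpu)->efficiency_class & 0xff;
		__set_bit(ec, used_classes);
	}

	/* A single class (or none) gives nothing to compare. */
	if (bitmap_weight(used_classes, 256) <= 1)
		return;

	/* Walk the used classes in ascending order, assigning dense indices. */
	index = 0;
	for_each_set_bit(class, used_classes, 256) {
		for_each_possible_cpu(cpu) {
			if (acpi_cpu_get_madt_gicc(cpu)->efficiency_class == class)
				per_cpu(efficiency_class, cpu) = index;
		}
		index++;
	}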
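
The same squeeze can be tried outside the kernel. The following is a hedged user-space sketch, not the driver code: a plain byte array stands in for the 256-bit bitmap, a lookup table replaces the nested per-CPU loop, and the name squeeze_classes is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CLASSES 256	/* efficiency class values are in [0:255] */

/*
 * Rewrite raw[] in place with dense, order-preserving indices.
 * Returns the number of distinct classes found.
 */
static unsigned int squeeze_classes(uint8_t *raw, size_t nr_cpus)
{
	uint8_t used[NR_CLASSES] = { 0 };	/* stand-in for the bitmap */
	uint8_t index_of[NR_CLASSES] = { 0 };	/* raw class -> dense index */
	unsigned int class, index = 0;
	size_t cpu;

	/* Pass 1: record which raw class values appear at all. */
	for (cpu = 0; cpu < nr_cpus; cpu++)
		used[raw[cpu]] = 1;

	/* Pass 2: walk used values in ascending order, assigning indices. */
	for (class = 0; class < NR_CLASSES; class++) {
		if (used[class])
			index_of[class] = (uint8_t)index++;
	}

	/* Pass 3: replace each raw value by its dense index. */
	for (cpu = 0; cpu < nr_cpus; cpu++)
		raw[cpu] = index_of[raw[cpu]];

	return index;
}
```

With the classes from the commit message, 111, 212 and 250 become 0, 1 and 2. A return value of 1 or less corresponds to the case where no Energy Model should be registered.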
Thanks,
M.
--
Jazz is not dead. It just smells funny...
* Re: [PATCH v1 2/3] cpufreq: CPPC: Add per_cpu efficiency_class
2022-03-17 15:13 ` Marc Zyngier
@ 2022-03-17 16:07 ` Pierre Gondois
2022-03-17 16:31 ` Marc Zyngier
0 siblings, 1 reply; 10+ messages in thread
From: Pierre Gondois @ 2022-03-17 16:07 UTC (permalink / raw)
To: Marc Zyngier
Cc: linux-kernel, Ionela.Voinescu, Lukasz.Luba, Morten.Rasmussen,
Dietmar.Eggemann, mka, daniel.lezcano, Catalin Marinas,
Will Deacon, Rafael J. Wysocki, Viresh Kumar, Mark Rutland,
Ard Biesheuvel, Fuad Tabba, Rob Herring, linux-arm-kernel,
linux-pm
On 3/17/22 16:13, Marc Zyngier wrote:
> On 2022-03-17 13:34, Pierre Gondois wrote:
>> In ACPI, describing power efficiency of CPUs can be done through the
>> following arm specific field:
>> ACPI 6.4, s5.2.12.14 'GIC CPU Interface (GICC) Structure',
>> 'Processor Power Efficiency Class field':
>> Describes the relative power efficiency of the associated
>> processor. Lower efficiency class numbers are more efficient than
>> higher ones (e.g. efficiency class 0 should be treated as more
>> efficient than efficiency class 1). However, absolute values
>> of this number have no meaning: 2 isn’t necessarily half as
>> efficient as 1.
>>
>> The efficiency_class field is stored in the GicC structure of the
>> ACPI MADT table and it's currently supported in Linux for arm64 only.
>> Thus, this new functionality is introduced for arm64 only.
>>
>> To allow the cppc_cpufreq driver to know and preprocess the
>> efficiency_class values of all the CPUs, add a per_cpu efficiency_class
>> variable to store them. Also add a static efficiency_class_populated
>> to let the driver know efficiency_class values are usable and register
>> an artificial Energy Model (EM) based on normalized class values.
>>
>> At least 2 different efficiency classes must be present,
>> otherwise there is no use in creating an Energy Model.
>>
>> The efficiency_class values are squeezed in [0:#efficiency_class-1]
>> while conserving the order. For instance, efficiency classes of:
>> [111, 212, 250]
>> will be mapped to:
>> [0 (was 111), 1 (was 212), 2 (was 250)].
>>
>> Each policy being independently registered in the driver, populating
>> the per_cpu efficiency_class is done only once at the driver
>> initialization. This prevents from having each policy re-searching the
>> efficiency_class values of other CPUs.
>>
>> The patch also exports acpi_cpu_get_madt_gicc() to fetch the GicC
>> structure of the ACPI MADT table for each CPU.
>>
>> Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
>> ---
>> arch/arm64/kernel/smp.c | 1 +
>> drivers/cpufreq/cppc_cpufreq.c | 55 ++++++++++++++++++++++++++++++++++
>> 2 files changed, 56 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index 27df5c1e6baa..56637cbea5d6 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -512,6 +512,7 @@ struct acpi_madt_generic_interrupt
>> *acpi_cpu_get_madt_gicc(int cpu)
>> {
>> return &cpu_madt_gicc[cpu];
>> }
>> +EXPORT_SYMBOL(acpi_cpu_get_madt_gicc);
>
> Why not EXPORT_SYMBOL_GPL()?
From what I understand, this could be made EXPORT_SYMBOL_GPL().
The only reason was that the other exported symbol in the
file isn't restricted to GPL.
>
>>
>> /*
>> * acpi_map_gic_cpu_interface - parse processor MADT entry
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c
>> b/drivers/cpufreq/cppc_cpufreq.c
>> index 8f950fe72765..a6cd95c3b474 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -422,12 +422,66 @@ static unsigned int
>> cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
>> return cppc_get_transition_latency(cpu) / NSEC_PER_USEC;
>> }
>>
>> +static bool efficiency_class_populated;
>> +static DEFINE_PER_CPU(unsigned int, efficiency_class);
>> +
>> +static int populate_efficiency_class(void)
>> +{
>> + unsigned int min = UINT_MAX, max = 0, class;
>> + struct acpi_madt_generic_interrupt *gicc;
>> + int cpu;
>> +
>> + for_each_possible_cpu(cpu) {
>> + gicc = acpi_cpu_get_madt_gicc(cpu);
>> + if (!gicc)
>> + return -ENODEV;
>
> How can that happen if you made it here using ACPI?
This is indeed an extra check. It could be removed.
>
>> +
>> + per_cpu(efficiency_class, cpu) = gicc->efficiency_class;
>> + min = min_t(unsigned int, min, gicc->efficiency_class);
>> + max = max_t(unsigned int, max, gicc->efficiency_class);
>> + }
>
> Why don't you use a temporary bitmap of 256 bits, tracking
> the classes that are actually being used?
>
>> +
>> + if (min == max) {
>
> This would become (bitmap_weight(used_classes) <= 1). Then from
> the same construct you know how many different classes you have.
> You also have the min, max, and all the values in between.
>
>> + pr_debug("Efficiency classes are all equal (=%d). "
>> + "No EM registered", max);
>> + return -EINVAL;
>> + }
>> +
>> + /*
>> + * Squeeze efficiency class values on [0:#efficiency_class-1].
>> + * Values are per spec in [0:255].
>> + */
>> + for (class = 0; class < 256; class++) {
>> + unsigned int new_min, curr;
>> +
>> + new_min = UINT_MAX;
>> + for_each_possible_cpu(cpu) {
>> + curr = per_cpu(efficiency_class, cpu);
>> + if (curr == min)
>> + per_cpu(efficiency_class, cpu) = class;
>> + else if (curr > min)
>> + new_min = min(new_min, curr);
>> + }
>> +
>> + if (new_min == UINT_MAX)
>> + break;
>> + min = new_min;
>> + }
>
> I find it really hard to reason about this because you are
> dynamically rewriting the values you keep reevaluating.
>
> How about something like this, which I find more readable:
>
> 	DECLARE_BITMAP(used_classes, 256) = {};
> 	int class, index, cpu;
>
> 	for_each_possible_cpu(cpu) {
> 		unsigned int ec;
>
> 		ec = acpi_cpu_get_madt_gicc(cpu)->efficiency_class & 0xff;
> 		__set_bit(ec, used_classes);
> 	}
>
> 	if (bitmap_weight(used_classes, 256) <= 1)
> 		return;
>
> 	index = 0;
>
> 	for_each_set_bit(class, used_classes, 256) {
> 		for_each_possible_cpu(cpu) {
> 			if (acpi_cpu_get_madt_gicc(cpu)->efficiency_class == class)
> 				per_cpu(efficiency_class, cpu) = index;
> 		}
>
> 		index++;
> 	}
This is indeed much more readable. Thanks for the code snippet.
Regards,
Pierre
>
>
> Thanks,
>
> M.
* Re: [PATCH v1 2/3] cpufreq: CPPC: Add per_cpu efficiency_class
2022-03-17 16:07 ` Pierre Gondois
@ 2022-03-17 16:31 ` Marc Zyngier
0 siblings, 0 replies; 10+ messages in thread
From: Marc Zyngier @ 2022-03-17 16:31 UTC (permalink / raw)
To: Pierre Gondois
Cc: linux-kernel, Ionela.Voinescu, Lukasz.Luba, Morten.Rasmussen,
Dietmar.Eggemann, mka, daniel.lezcano, Catalin Marinas,
Will Deacon, Rafael J. Wysocki, Viresh Kumar, Mark Rutland,
Ard Biesheuvel, Fuad Tabba, Rob Herring, linux-arm-kernel,
linux-pm
On Thu, 17 Mar 2022 16:07:01 +0000,
Pierre Gondois <pierre.gondois@arm.com> wrote:
>
> >> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> >> index 27df5c1e6baa..56637cbea5d6 100644
> >> --- a/arch/arm64/kernel/smp.c
> >> +++ b/arch/arm64/kernel/smp.c
> >> @@ -512,6 +512,7 @@ struct acpi_madt_generic_interrupt
> >> *acpi_cpu_get_madt_gicc(int cpu)
> >> {
> >> return &cpu_madt_gicc[cpu];
> >> }
> >> +EXPORT_SYMBOL(acpi_cpu_get_madt_gicc);
> >
> > Why not EXPORT_SYMBOL_GPL()?
>
> From what I understand, this could be made EXPORT_SYMBOL_GPL().
> The only reason was that the other exported symbol in the
> file isn't restricted to GPL.
I'm personally keen on keeping this for GPL code only, just like the
current code is. If there is a further need to relax this, we can
discuss it separately.
>
> >
> >>
> >> /*
> >> * acpi_map_gic_cpu_interface - parse processor MADT entry
> >> diff --git a/drivers/cpufreq/cppc_cpufreq.c
> >> b/drivers/cpufreq/cppc_cpufreq.c
> >> index 8f950fe72765..a6cd95c3b474 100644
> >> --- a/drivers/cpufreq/cppc_cpufreq.c
> >> +++ b/drivers/cpufreq/cppc_cpufreq.c
> >> @@ -422,12 +422,66 @@ static unsigned int
> >> cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
> >> return cppc_get_transition_latency(cpu) / NSEC_PER_USEC;
> >> }
> >>
> >> +static bool efficiency_class_populated;
> >> +static DEFINE_PER_CPU(unsigned int, efficiency_class);
> >> +
> >> +static int populate_efficiency_class(void)
> >> +{
> >> + unsigned int min = UINT_MAX, max = 0, class;
> >> + struct acpi_madt_generic_interrupt *gicc;
> >> + int cpu;
> >> +
> >> + for_each_possible_cpu(cpu) {
> >> + gicc = acpi_cpu_get_madt_gicc(cpu);
> >> + if (!gicc)
> >> + return -ENODEV;
> >
> > How can that happen if you made it here using ACPI?
>
> This is indeed an extra check. It could be removed.
Please do.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* [PATCH v1 3/3] cpufreq: CPPC: Register EM based on efficiency class information
2022-03-17 13:34 [PATCH v1 0/3] Enable EAS for CPPC/ACPI based systems Pierre Gondois
2022-03-17 13:34 ` [PATCH v1 1/3] cpufreq: CPPC: Add cppc_cpufreq_search_cpu_data Pierre Gondois
2022-03-17 13:34 ` [PATCH v1 2/3] cpufreq: CPPC: Add per_cpu efficiency_class Pierre Gondois
@ 2022-03-17 13:34 ` Pierre Gondois
2 siblings, 0 replies; 10+ messages in thread
From: Pierre Gondois @ 2022-03-17 13:34 UTC (permalink / raw)
To: linux-kernel
Cc: Ionela.Voinescu, Lukasz.Luba, Morten.Rasmussen, Dietmar.Eggemann,
mka, daniel.lezcano, Pierre Gondois, Catalin Marinas, Will Deacon,
Rafael J. Wysocki, Viresh Kumar, Mark Rutland, Ard Biesheuvel,
Fuad Tabba, Lee Jones, Valentin Schneider, Hector Martin,
Rob Herring, linux-arm-kernel, linux-pm
Performance states and energy consumption values are not advertised
in ACPI. In the GicC structure of the MADT table, the "Processor
Power Efficiency Class field" (called efficiency class from now on)
describes the relative energy efficiency of CPUs.
To leverage the EM and EAS, the CPPC driver creates a set of
artificial performance states and registers them in the Energy Model
(EM), such that:
- A performance state is created every 20 capacity units.
- The energy cost of each performance state gradually increases.
No power value is generated as only the cost is used in the EM.
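
As a rough illustration of how many artificial performance states this yields, here is a user-space sketch of the counting done by get_perf_level_count() in this patch. The function name sketch_perf_level_count, the explicit parameters and the guard against a zero highest_perf are assumptions of the sketch, as are the perf values used to exercise it.

```c
#include <stdint.h>

#define CPPC_EM_CAP_STEP 20	/* one performance state per 20 capacity units */

static unsigned int sketch_perf_level_count(unsigned int max_cap,
					    unsigned int lowest_perf,
					    unsigned int highest_perf)
{
	unsigned int min_cap;

	/* Guard added for this sketch only, to avoid dividing by zero. */
	if (highest_perf == 0)
		return 0;

	/* Scale the lowest performance level into capacity units. */
	min_cap = (unsigned int)((uint64_t)max_cap * lowest_perf / highest_perf);
	if (min_cap == 0 || max_cap < min_cap)
		return 0;

	/* Count the 20-unit steps between min_cap and max_cap, inclusive. */
	return 1 + max_cap / CPPC_EM_CAP_STEP - min_cap / CPPC_EM_CAP_STEP;
}
```

For a CPU of capacity 1024 whose lowest_perf is one tenth of its highest_perf (a made-up ratio), min_cap is 102 and the count is 1 + 51 - 5 = 47 states.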
During task placement, a task can raise the frequency of its whole
pd. This can make EAS place a task on a pd with CPUs that are
individually less energy efficient.
As cost values are artificial, and to place tasks on CPUs with the
lower efficiency class, a gap in cost values is generated for adjacent
efficiency classes.
E.g.:
- efficiency class = 0, capacity is in [0-1024], so cost values
are in [0: 51] (one performance state every 20 capacity units)
- efficiency class = 1, capacity is in [0-1024], cost values
are in [1*gap+0: 1*gap+51].
The value of the cost gap is chosen to absorb the energy of 4 CPUs
at their maximum capacity. This means that between:
1- a pd of 4 CPUs, each of them being used at almost their full
capacity. Their efficiency class is N.
2- a CPU using almost none of its capacity. Its efficiency class is
N+1
EAS will choose the first option.
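The cost ranges above can be checked with a small standalone sketch.
It reuses the constants of this patch (capacity step of 20, cost step
of 1, gap sized for 4 CPUs) and the formula of compute_cost(). The
function name artificial_cost() and the use of a capacity value
instead of a precomputed step are assumptions made for illustration.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE	1024UL	/* kernel capacity scale */
#define CAP_STEP		20UL	/* CPPC_EM_CAP_STEP in the patch */
#define COST_STEP		1UL	/* CPPC_EM_COST_STEP in the patch */
/* Gap between adjacent efficiency classes: 4 * 1024 * 1 / 20 = 204. */
#define COST_GAP	(4 * SCHED_CAPACITY_SCALE * COST_STEP / CAP_STEP)

/*
 * Hypothetical helper: artificial cost of a CPU of the given efficiency
 * class running at the given capacity, following compute_cost().
 */
static unsigned long artificial_cost(unsigned int efficiency_class,
				     unsigned long capacity)
{
	/* Capacities above the scale are outside the modelled range. */
	assert(capacity <= SCHED_CAPACITY_SCALE);

	return COST_GAP * efficiency_class +
	       (capacity / CAP_STEP) * COST_STEP;
}
```

For class 0 the cost spans 0 to 51, and for class 1 it spans 204 to
255, so the lowest cost of class 1 stays above the highest cost of
class 0. For these values the gap of 204 also equals 4 times the
maximum per-class cost of 51.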
Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
---
drivers/cpufreq/cppc_cpufreq.c | 142 +++++++++++++++++++++++++++++++++
1 file changed, 142 insertions(+)
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index a6cd95c3b474..b65586511bc3 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -425,6 +425,129 @@ static unsigned int cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
static bool efficiency_class_populated;
static DEFINE_PER_CPU(unsigned int, efficiency_class);
+/* Create an artificial performance state every CPPC_EM_CAP_STEP capacity units. */
+#define CPPC_EM_CAP_STEP (20)
+/* Increase the cost value by CPPC_EM_COST_STEP every performance state. */
+#define CPPC_EM_COST_STEP (1)
+/* Add a cost gap corresponding to the energy of 4 CPUs. */
+#define CPPC_EM_COST_GAP (4 * SCHED_CAPACITY_SCALE * CPPC_EM_COST_STEP \
+ / CPPC_EM_CAP_STEP)
+
+static unsigned int get_perf_level_count(struct cpufreq_policy *policy)
+{
+ struct cppc_perf_caps *perf_caps;
+ unsigned int min_cap, max_cap;
+ struct cppc_cpudata *cpu_data;
+ int cpu = policy->cpu;
+
+ cpu_data = cppc_cpufreq_search_cpu_data(cpu);
+ perf_caps = &cpu_data->perf_caps;
+ max_cap = arch_scale_cpu_capacity(cpu);
+ min_cap = div_u64(max_cap * perf_caps->lowest_perf, perf_caps->highest_perf);
+ if ((min_cap == 0) || (max_cap < min_cap))
+ return 0;
+ return 1 + max_cap / CPPC_EM_CAP_STEP - min_cap / CPPC_EM_CAP_STEP;
+}
+
+/*
+ * The cost is defined as:
+ * cost = power * max_frequency / frequency
+ */
+static inline unsigned long compute_cost(int cpu, int step)
+{
+ return CPPC_EM_COST_GAP * per_cpu(efficiency_class, cpu) +
+ step * CPPC_EM_COST_STEP;
+}
+
+static int cppc_get_cpu_power(struct device *cpu_dev,
+ unsigned long *power, unsigned long *KHz)
+{
+ unsigned long perf_step, perf_prev, perf, perf_check;
+ unsigned int min_step, max_step, step, step_check;
+ unsigned long prev_freq = *KHz;
+ unsigned int min_cap, max_cap;
+
+ struct cppc_perf_caps *perf_caps;
+ struct cppc_cpudata *cpu_data;
+
+ cpu_data = cppc_cpufreq_search_cpu_data(cpu_dev->id);
+ perf_caps = &cpu_data->perf_caps;
+ max_cap = arch_scale_cpu_capacity(cpu_dev->id);
+ min_cap = div_u64(max_cap * perf_caps->lowest_perf,
+ perf_caps->highest_perf);
+
+ perf_step = CPPC_EM_CAP_STEP * perf_caps->highest_perf / max_cap;
+ min_step = min_cap / CPPC_EM_CAP_STEP;
+ max_step = max_cap / CPPC_EM_CAP_STEP;
+
+ perf_prev = cppc_cpufreq_khz_to_perf(cpu_data, *KHz);
+ step = perf_prev / perf_step;
+
+ if (step > max_step)
+ return -EINVAL;
+
+ if (min_step == max_step) {
+ step = max_step;
+ perf = perf_caps->highest_perf;
+ } else if (step < min_step) {
+ step = min_step;
+ perf = perf_caps->lowest_perf;
+ } else {
+ step++;
+ if (step == max_step)
+ perf = perf_caps->highest_perf;
+ else
+ perf = step * perf_step;
+ }
+
+ *KHz = cppc_cpufreq_perf_to_khz(cpu_data, perf);
+ perf_check = cppc_cpufreq_khz_to_perf(cpu_data, *KHz);
+ step_check = perf_check / perf_step;
+
+ /*
+ * To avoid bad integer approximation, check that new frequency value
+ * increased and that the new frequency will be converted to the
+ * desired step value.
+ */
+ while ((*KHz == prev_freq) || (step_check != step)) {
+ perf++;
+ *KHz = cppc_cpufreq_perf_to_khz(cpu_data, perf);
+ perf_check = cppc_cpufreq_khz_to_perf(cpu_data, *KHz);
+ step_check = perf_check / perf_step;
+ }
+
+ /*
+ * With an artificial EM, only the cost value is used. The power is
+ * still populated such that 0 < power < EM_MAX_POWER, which keeps
+ * the artificial performance states meaningful.
+ */
+ *power = compute_cost(cpu_dev->id, step);
+
+ return 0;
+}
+
+static int cppc_get_cpu_cost(struct device *cpu_dev, unsigned long KHz,
+ unsigned long *cost)
+{
+ unsigned long perf_step, perf_prev;
+ struct cppc_perf_caps *perf_caps;
+ struct cppc_cpudata *cpu_data;
+ unsigned int max_cap;
+ int step;
+
+ cpu_data = cppc_cpufreq_search_cpu_data(cpu_dev->id);
+ perf_caps = &cpu_data->perf_caps;
+ max_cap = arch_scale_cpu_capacity(cpu_dev->id);
+
+ perf_prev = cppc_cpufreq_khz_to_perf(cpu_data, KHz);
+ perf_step = CPPC_EM_CAP_STEP * perf_caps->highest_perf / max_cap;
+ step = perf_prev / perf_step;
+
+ *cost = compute_cost(cpu_dev->id, step);
+
+ return 0;
+}
+
static int populate_efficiency_class(void)
{
unsigned int min = UINT_MAX, max = 0, class;
@@ -472,6 +595,21 @@ static int populate_efficiency_class(void)
return 0;
}
+static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
+{
+ struct cppc_cpudata *cpu_data;
+ struct em_data_callback em_cb =
+ EM_ADV_DATA_CB(cppc_get_cpu_power, cppc_get_cpu_cost);
+
+ if (!efficiency_class_populated)
+ return;
+
+ cpu_data = cppc_cpufreq_search_cpu_data(policy->cpu);
+ em_dev_register_perf_domain(get_cpu_device(policy->cpu),
+ get_perf_level_count(policy), &em_cb,
+ cpu_data->shared_cpu_map, 0);
+}
+
#else
static unsigned int cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
@@ -482,6 +620,9 @@ static int populate_efficiency_class(void)
{
return 0;
}
+static void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
+{
+}
#endif
@@ -753,6 +894,7 @@ static struct cpufreq_driver cppc_cpufreq_driver = {
.init = cppc_cpufreq_cpu_init,
.exit = cppc_cpufreq_cpu_exit,
.set_boost = cppc_cpufreq_set_boost,
+ .register_em = cppc_cpufreq_register_em,
.attr = cppc_cpufreq_attr,
.name = "cppc_cpufreq",
};
--
2.25.1