* [PATCH v1 0/3] cpufreq: intel_pstate: Simplify the energy model for hybrid systems
@ 2025-10-08 18:54 Rafael J. Wysocki
2025-10-08 18:55 ` [PATCH v1 1/3] cpufreq: intel_pstate: Add and use hybrid_get_cpu_type() Rafael J. Wysocki
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2025-10-08 18:54 UTC
To: Linux PM
Cc: LKML, Lukasz Luba, Srinivas Pandruvada, Dietmar Eggemann,
Christian Loehle
Hi Everyone,
I've realized recently that the (artificial) energy model used by intel_pstate
on hybrid platforms without SMT is much more complicated than really necessary,
so I've decided to simplify it.
The new energy model uses less memory and it introduces less overhead into the
scheduler (mostly due to the reduction of the states table size). This change
is made in patch [3/3] (please see the changelog for the rationale).
The first two patches in the series are preliminary.
Thanks!
* [PATCH v1 1/3] cpufreq: intel_pstate: Add and use hybrid_get_cpu_type()
2025-10-08 18:54 [PATCH v1 0/3] cpufreq: intel_pstate: Simplify the energy model for hybrid systems Rafael J. Wysocki
@ 2025-10-08 18:55 ` Rafael J. Wysocki
2025-10-08 18:56 ` [PATCH v1 2/3] cpufreq: intel_pstate: Add and use hybrid_has_l3() Rafael J. Wysocki
2025-10-08 19:22 ` [PATCH v1 3/3] cpufreq: intel_pstate: Simplify the energy model for hybrid systems Rafael J. Wysocki
2 siblings, 0 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2025-10-08 18:55 UTC
To: Linux PM
Cc: LKML, Lukasz Luba, Srinivas Pandruvada, Dietmar Eggemann,
Christian Loehle
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Introduce a function for identifying the type of a given CPU in a
hybrid system, called hybrid_get_cpu_type(), and use it for hybrid
scaling factor determination in hwp_get_cpu_scaling().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/cpufreq/intel_pstate.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -912,6 +912,11 @@ static struct freq_attr *hwp_cpufreq_att
[HWP_CPUFREQ_ATTR_COUNT] = NULL,
};
+static u8 hybrid_get_cpu_type(unsigned int cpu)
+{
+ return cpu_data(cpu).topo.intel_type;
+}
+
static bool no_cas __ro_after_init;
static struct cpudata *hybrid_max_perf_cpu __read_mostly;
@@ -2298,18 +2303,14 @@ static int knl_get_turbo_pstate(int cpu)
static int hwp_get_cpu_scaling(int cpu)
{
if (hybrid_scaling_factor) {
- struct cpuinfo_x86 *c = &cpu_data(cpu);
- u8 cpu_type = c->topo.intel_type;
-
/*
* Return the hybrid scaling factor for P-cores and use the
* default core scaling for E-cores.
*/
- if (cpu_type == INTEL_CPU_TYPE_CORE)
+ if (hybrid_get_cpu_type(cpu) == INTEL_CPU_TYPE_CORE)
return hybrid_scaling_factor;
- if (cpu_type == INTEL_CPU_TYPE_ATOM)
- return core_get_scaling();
+ return core_get_scaling();
}
/* Use core scaling on non-hybrid systems. */
* [PATCH v1 2/3] cpufreq: intel_pstate: Add and use hybrid_has_l3()
2025-10-08 18:54 [PATCH v1 0/3] cpufreq: intel_pstate: Simplify the energy model for hybrid systems Rafael J. Wysocki
2025-10-08 18:55 ` [PATCH v1 1/3] cpufreq: intel_pstate: Add and use hybrid_get_cpu_type() Rafael J. Wysocki
@ 2025-10-08 18:56 ` Rafael J. Wysocki
2025-10-08 19:22 ` [PATCH v1 3/3] cpufreq: intel_pstate: Simplify the energy model for hybrid systems Rafael J. Wysocki
2 siblings, 0 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2025-10-08 18:56 UTC
To: Linux PM
Cc: LKML, Lukasz Luba, Srinivas Pandruvada, Dietmar Eggemann,
Christian Loehle
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Introduce a function for checking whether or not a given CPU has L3
cache, called hybrid_has_l3(), and use it in hybrid_get_cost() for
computing cost coefficients associated with a given perf domain.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/cpufreq/intel_pstate.c | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -951,11 +951,26 @@ static int hybrid_active_power(struct de
return 0;
}
+static bool hybrid_has_l3(unsigned int cpu)
+{
+ struct cpu_cacheinfo *cacheinfo = get_cpu_cacheinfo(cpu);
+ unsigned int i;
+
+ if (!cacheinfo)
+ return false;
+
+ for (i = 0; i < cacheinfo->num_leaves; i++) {
+ if (cacheinfo->info_list[i].level == 3)
+ return true;
+ }
+
+ return false;
+}
+
static int hybrid_get_cost(struct device *dev, unsigned long freq,
unsigned long *cost)
{
struct pstate_data *pstate = &all_cpu_data[dev->id]->pstate;
- struct cpu_cacheinfo *cacheinfo = get_cpu_cacheinfo(dev->id);
/*
* The smaller the perf-to-frequency scaling factor, the larger the IPC
@@ -973,17 +988,8 @@ static int hybrid_get_cost(struct device
* touching it in case some other CPUs of the same type can do the work
* without it.
*/
- if (cacheinfo) {
- unsigned int i;
-
- /* Check if L3 cache is there. */
- for (i = 0; i < cacheinfo->num_leaves; i++) {
- if (cacheinfo->info_list[i].level == 3) {
- *cost += 2;
- break;
- }
- }
- }
+ if (hybrid_has_l3(dev->id))
+ *cost += 2;
return 0;
}
* [PATCH v1 3/3] cpufreq: intel_pstate: Simplify the energy model for hybrid systems
2025-10-08 18:54 [PATCH v1 0/3] cpufreq: intel_pstate: Simplify the energy model for hybrid systems Rafael J. Wysocki
2025-10-08 18:55 ` [PATCH v1 1/3] cpufreq: intel_pstate: Add and use hybrid_get_cpu_type() Rafael J. Wysocki
2025-10-08 18:56 ` [PATCH v1 2/3] cpufreq: intel_pstate: Add and use hybrid_has_l3() Rafael J. Wysocki
@ 2025-10-08 19:22 ` Rafael J. Wysocki
2025-10-11 8:50 ` Yaxiong Tian
2 siblings, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2025-10-08 19:22 UTC
To: Linux PM
Cc: LKML, Lukasz Luba, Srinivas Pandruvada, Dietmar Eggemann,
Christian Loehle
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since em_cpu_energy() computes the cost of using a given CPU to
do work as a product of the utilization of that CPU and a constant
positive cost coefficient supplied through an energy model, EAS
evenly distributes the load among CPUs represented by identical
one-CPU PDs regardless of what is there in the energy model.
Namely, two CPUs represented by identical PDs have the same energy
model data and if the PDs are one-CPU, max_util is always equal to the
utilization of the given CPU, possibly increased by the utilization
of a task that is waking up. The cost coefficient is a monotonically
increasing (or at least non-decreasing) function of max_util, so the
CPU with higher utilization will generally get a higher (or at least
not lower) cost coefficient. After multiplying that coefficient by
CPU utilization, the resulting number will always be higher for the
CPU with higher utilization. Accordingly, whenever these two CPUs
are compared, the cost of running a waking task will always be higher
for the CPU with higher utilization, which leads to the even distribution
of load mentioned above.
For this reason, the energy model can be adjusted in arbitrary
ways without disturbing the even distribution of load among CPUs
represented by identical one-CPU PDs. In particular, for all of
those CPUs, the energy model can provide one cost coefficient that
does not depend on the performance level.
Moreover, if there are two different CPU types, A and B, each having
a performance-independent cost coefficient in the EM, then these
cost coefficients determine the utilization levels at which CPUs
of type A and B will be regarded as equally expensive for running
a waking task. For example, if the cost coefficient for CPU type
A is 1, the cost coefficient for CPU type B is 2, and the utilization
of the waking task is x, a CPU of type A will be regarded as "cost-
equivalent" to a CPU of type B if its utilization is the sum of x and
twice the utilization of the latter. Similarly, for the cost
coefficients equal to 2 and 3, respectively, the "cost equivalence"
utilization of CPU type A will be the sum of x/2 and the CPU type B
utilization multiplied by 3/2. In the limit of negligibly small x,
the "cost equivalence" utilization of CPU type A is just the
utilization of CPU type B multiplied by the ratio of the cost
coefficients for B and A. That ratio can be regarded as an effective
"cost priority" of CPU type A relative to CPU type B, as it indicates
how much more on average the former needs to be loaded so it can be
regarded as cost-equivalent to the latter (for low-utilization tasks).
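The two numerical examples above can be reproduced with a minimal sketch,
assuming (as the examples do) that the quantity compared for each CPU is
its cost coefficient multiplied by its utilization including the waking
task. The helper name cost_equiv_util_a() is hypothetical and is not part
of this patch or of the kernel:

```c
#include <assert.h>

/*
 * Hypothetical helper (illustration only): utilization at which a CPU of
 * type A is "cost-equivalent" to a CPU of type B, assuming the compared
 * quantity is cost * (utilization + x), that is:
 *
 *   cost_a * (util_a + x) == cost_b * (util_b + x)
 *
 * Solving for util_a gives the expression returned below.
 */
double cost_equiv_util_a(double cost_a, double cost_b,
			 double util_b, double x)
{
	assert(cost_a > 0.0 && cost_b > 0.0);

	return cost_b * (util_b + x) / cost_a - x;
}
```

For cost coefficients 1 and 2, a type B utilization of 100 and x = 10, this
yields 210 (x plus twice 100). For cost coefficients 2 and 3 it yields 155
(x/2 plus 3/2 of 100), and for x = 0 it reduces to the plain coefficient
ratio times the type B utilization.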
Use the above observations for simplifying the default energy model
for hybrid platforms in intel_pstate as follows:
* A performance-independent cost coefficient is introduced for each CPU
type.
* The number of states in each PD is reduced to 2 (it is not necessary
to use more of them because the cost per scale-invariant utilization
point does not depend on the performance level any more).
* CPUs without L3 cache (LPE-cores) that are expected to be the most
energy-efficient ones are prioritized over any other CPUs.
* The CPU type value from CPUID (now easily accessible through
cpu_data[]) is used for identifying P-cores and E-cores instead
of hybrid scaling factors which are less reliable.
* E-cores are preferred to P-cores.
The cost coefficients for different CPU types that can appear in a
hybrid system (P-cores, E-cores, and LPE-cores that are effectively
E-cores without L3 cache and with lower capacity) are chosen in
accordance with the following rules:
* The cost priority of LPE-cores relative to E-cores is 1.5.
* The cost priority of E-cores relative to P-cores is 2, which
also means that the cost priority of LPE-cores relative to
P-cores is 3.
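As a quick sanity check, the cost priorities listed above follow from the
cost coefficients that hybrid_get_cost() assigns in this patch (2 for CPUs
without L3 cache, 3 for E-cores with L3 cache, 6 otherwise). The sketch
below only restates that arithmetic; the enum and cost_priority() are
hypothetical names used for illustration:

```c
#include <assert.h>

/* Cost coefficients as assigned by hybrid_get_cost() in this patch. */
enum {
	HYBRID_COST_LPE = 2,	/* CPUs without L3 cache */
	HYBRID_COST_E   = 3,	/* E-cores with L3 cache */
	HYBRID_COST_P   = 6,	/* all other CPUs (P-cores) */
};

/*
 * Hypothetical helper (illustration only): "cost priority" of the CPU type
 * with coefficient cost_a relative to the CPU type with coefficient cost_b,
 * defined in the changelog as the ratio of the coefficients for B and A.
 */
double cost_priority(unsigned int cost_a, unsigned int cost_b)
{
	assert(cost_a > 0);

	return (double)cost_b / (double)cost_a;
}
```

With these values, cost_priority(HYBRID_COST_LPE, HYBRID_COST_E) is 1.5,
cost_priority(HYBRID_COST_E, HYBRID_COST_P) is 2 and
cost_priority(HYBRID_COST_LPE, HYBRID_COST_P) is 3.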
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/cpufreq/intel_pstate.c | 41 +++++++++++++++--------------------------
1 file changed, 15 insertions(+), 26 deletions(-)
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -927,23 +927,20 @@ static struct cpudata *hybrid_max_perf_c
static DEFINE_MUTEX(hybrid_capacity_lock);
#ifdef CONFIG_ENERGY_MODEL
-#define HYBRID_EM_STATE_COUNT 4
+#define HYBRID_EM_STATE_COUNT 2
static int hybrid_active_power(struct device *dev, unsigned long *power,
unsigned long *freq)
{
/*
- * Create "utilization bins" of 0-40%, 40%-60%, 60%-80%, and 80%-100%
- * of the maximum capacity such that two CPUs of the same type will be
- * regarded as equally attractive if the utilization of each of them
- * falls into the same bin, which should prevent tasks from being
- * migrated between them too often.
+ * Create two "states" corresponding to 50% and 100% of the full
+ * capacity.
*
- * For this purpose, return the "frequency" of 2 for the first
+ * For this purpose, return the "frequency" of 1 for the first
* performance level and otherwise leave the value set by the caller.
*/
if (!*freq)
- *freq = 2;
+ *freq = 1;
/* No power information. */
*power = EM_MAX_POWER;
@@ -970,26 +967,18 @@ static bool hybrid_has_l3(unsigned int c
static int hybrid_get_cost(struct device *dev, unsigned long freq,
unsigned long *cost)
{
- struct pstate_data *pstate = &all_cpu_data[dev->id]->pstate;
-
- /*
- * The smaller the perf-to-frequency scaling factor, the larger the IPC
- * ratio between the given CPU and the least capable CPU in the system.
- * Regard that IPC ratio as the primary cost component and assume that
- * the scaling factors for different CPU types will differ by at least
- * 5% and they will not be above INTEL_PSTATE_CORE_SCALING.
- *
- * Add the freq value to the cost, so that the cost of running on CPUs
- * of the same type in different "utilization bins" is different.
- */
- *cost = div_u64(100ULL * INTEL_PSTATE_CORE_SCALING, pstate->scaling) + freq;
/*
- * Increase the cost slightly for CPUs able to access L3 to avoid
- * touching it in case some other CPUs of the same type can do the work
- * without it.
+ * The cost per scale-invariant utilization point for LPE-cores (CPUs
+ * without L3 cache), E-cores and P-cores is chosen so that the cost
+ * priority of LPE-cores relative to E-cores is 1.5 and the cost
+ * priority of E-cores relative to P-cores is 2.
*/
- if (hybrid_has_l3(dev->id))
- *cost += 2;
+ if (!hybrid_has_l3(dev->id))
+ *cost = 2;
+ else if (hybrid_get_cpu_type(dev->id) == INTEL_CPU_TYPE_ATOM)
+ *cost = 3;
+ else
+ *cost = 6;
return 0;
}
* Re: [PATCH v1 3/3] cpufreq: intel_pstate: Simplify the energy model for hybrid systems
2025-10-08 19:22 ` [PATCH v1 3/3] cpufreq: intel_pstate: Simplify the energy model for hybrid systems Rafael J. Wysocki
@ 2025-10-11 8:50 ` Yaxiong Tian
2025-10-11 12:29 ` [PATCH " Rafael J. Wysocki
0 siblings, 1 reply; 6+ messages in thread
From: Yaxiong Tian @ 2025-10-11 8:50 UTC
To: rafael
Cc: christian.loehle, dietmar.eggemann, linux-kernel, linux-pm,
lukasz.luba, srinivas.pandruvada
> Since em_cpu_energy() computes the cost of using a given CPU to
> do work as a product of the utilization of that CPU and a constant
> positive cost coefficient supplied through an energy model, EAS
> evenly distributes the load among CPUs represented by identical
> one-CPU PDs regardless of what is there in the energy model.
>
> Namely, two CPUs represented by identical PDs have the same energy
> model data and if the PDs are one-CPU, max_util is always equal to the
> utilization of the given CPU, possibly increased by the utilization
> of a task that is waking up. The cost coefficient is a monotonically
> increasing (or at least non-decreasing) function of max_util, so the
> CPU with higher utilization will generally get a higher (or at least
> not lower) cost coefficient. After multiplying that coefficient by
> CPU utilization, the resulting number will always be higher for the
> CPU with higher utilization. Accordingly, whenever these two CPUs
> are compared, the cost of running a waking task will always be higher
> for the CPU with higher utilization which leads to the even distribution
> of load mentioned above.
>
> For this reason, the energy model can be adjusted in arbitrary
> ways without disturbing the even distribution of load among CPUs
> represented by identical one-CPU PDs. In particular, for all of
> those CPUs, the energy model can provide one cost coefficient that
> does not depend on the performance level.
But if the cost is a constant that does not depend on performance levels,
then the energy increment for running a waking task on these CPUs would be
the same. For example, for a task with utilization u, whether it is placed
on CPU A or CPU B, since the cost is the same, the energy increment generated
would be identical. In this case, EAS should not perform load balancing
between them.
>
> Moreover, if there are two different CPU types, A and B, each having
> a performance-independent cost coefficient in the EM, then these
> cost coefficients determine the utilization levels at which CPUs
> of type A and B will be regarded as equally expensive for running
> a waking task. For example, if the cost coefficient for CPU type
> A is 1, the cost coefficient for CPU type B is 2, and the utilization
> of the waking task is x, a CPU of type A will be regarded as "cost-
> equivalent" to a CPU of type B if its utilization is the sum of x and
> twice the utilization of the latter. Similarly, for the cost
> coefficients equal to 2 and 3, respectively, the "cost equivalence"
> utilization of CPU type A will be the sum of x/2 and the CPU type B
> utilization multiplied by 3/2. In the limit of negligibly small x,
> the "cost equivalence" utilization of CPU type A is just the
> utilization of CPU type B multiplied by the ratio of the cost
> coefficients for B and A. That ratio can be regarded as an effective
> "cost priority" of CPU type A relative to CPU type B, as it indicates
> how much more on average the former needs to be loaded so it can be
> regarded as cost-equivalent to the latter (for low-utilization tasks).
>
> Use the above observations for simplifying the default energy model
> for hybrid platforms in intel_pstate as follows:
>
> * A performance-independent cost coefficient is introduced for each CPU
> type.
>
> * The number of states in each PD is reduced to 2 (it is not necessary
> to use more of them because the cost per scale-invariant utilization
> point does not depend on the performance level any more).
>
> * CPUs without L3 cache (LPE-cores) that are expected to be the most
> energy-efficient ones are prioritized over any other CPUs.
>
> * The CPU type value from CPUID (now easily accessible through
> cpu_data[]) is used for identifying P-cores and E-cores instead
> of hybrid scaling factors which are less reliable.
>
> * E-cores are preferred to P-cores.
>
> The cost coefficients for different CPU types that can appear in a
> hybrid system (P-cores, E-cores, and LPE-cores that are effectively
> E-cores without L3 cache and with lower capacity) are chosen in
> accordance with the following rules:
>
> * The cost priority of LPE-cores relative to E-cores is 1.5.
>
> * The cost priority of E-cores relative to P-cores is 2, which
> also means that the cost priority of LPE-cores relative to
> P-cores is 3.
* Re: [PATCH v1 3/3] cpufreq: intel_pstate: Simplify the energy model for hybrid systems
2025-10-11 8:50 ` Yaxiong Tian
@ 2025-10-11 12:29 ` Rafael J. Wysocki
0 siblings, 0 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2025-10-11 12:29 UTC
To: Yaxiong Tian
Cc: rafael, christian.loehle, dietmar.eggemann, linux-kernel,
linux-pm, lukasz.luba, srinivas.pandruvada
On Sat, Oct 11, 2025 at 10:51 AM Yaxiong Tian <tianyaxiong@kylinos.cn> wrote:
>
> > Since em_cpu_energy() computes the cost of using a given CPU to
> > do work as a product of the utilization of that CPU and a constant
> > positive cost coefficient supplied through an energy model, EAS
> > evenly distributes the load among CPUs represented by identical
> > one-CPU PDs regardless of what is there in the energy model.
> >
> > Namely, two CPUs represented by identical PDs have the same energy
> > model data and if the PDs are one-CPU, max_util is always equal to the
> > utilization of the given CPU, possibly increased by the utilization
> > of a task that is waking up. The cost coefficient is a monotonically
> > increasing (or at least non-decreasing) function of max_util, so the
> > CPU with higher utilization will generally get a higher (or at least
> > not lower) cost coefficient. After multiplying that coefficient by
> > CPU utilization, the resulting number will always be higher for the
> > CPU with higher utilization. Accordingly, whenever these two CPUs
> > are compared, the cost of running a waking task will always be higher
> > for the CPU with higher utilization which leads to the even distribution
> > of load mentioned above.
> >
> > For this reason, the energy model can be adjusted in arbitrary
> > ways without disturbing the even distribution of load among CPUs
> > represented by identical one-CPU PDs. In particular, for all of
> > those CPUs, the energy model can provide one cost coefficient that
> > does not depend on the performance level.
>
> But if the cost is a constant that does not depend on performance levels,
> then the energy increment for running a waking task on these CPUs would be
> the same. For example, for a task with utilization u, whether it is placed
> on CPU A or CPU B, since the cost is the same, the energy increment generated
> would be identical. In this case, EAS should not perform load balancing
> between them.
Right, what matters is the delta between base_energy and energy
including the waking task utilization.
I got confused somehow, sorry for the noise.
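The point about the delta can be illustrated with a minimal sketch,
assuming a one-CPU PD whose energy is modeled as a performance-independent
cost coefficient multiplied by utilization. The names pd_energy() and
waking_task_energy_delta() are hypothetical and are not kernel functions:

```c
#include <assert.h>

/*
 * Hypothetical model (illustration only): energy of a one-CPU PD as a
 * constant cost coefficient times the CPU utilization.
 */
unsigned long pd_energy(unsigned long cost, unsigned long util)
{
	return cost * util;
}

/*
 * Energy increment caused by placing a waking task with utilization
 * task_util on a CPU whose current utilization is util: the difference
 * between the energy including the task and the base energy.
 */
unsigned long waking_task_energy_delta(unsigned long cost,
				       unsigned long util,
				       unsigned long task_util)
{
	assert(cost > 0);

	return pd_energy(cost, util + task_util) - pd_energy(cost, util);
}
```

Under this assumption the delta is cost * task_util for any util, so two
CPUs with the same constant cost coefficient produce the same increment
regardless of how loaded each of them already is.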