public inbox for linux-kernel@vger.kernel.org
* [RFC][PATCH v0.1 0/6] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT
@ 2024-11-08 16:09 Rafael J. Wysocki
  2024-11-08 16:36 ` [RFC][PATCH v0.1 1/6] PM: EM: Move perf rebuilding function from schedutil to EM Rafael J. Wysocki
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-08 16:09 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri

Hi Everyone,

This series, on top of

https://lore.kernel.org/linux-pm/12554508.O9o76ZdvQC@rjwysocki.net/

modifies the energy model code, the EAS setup code and the intel_pstate
driver to enable simplified EAS support in the latter.

The underlying observation is that on the platforms targeted by these changes,
Lunar Lake at the time of this writing, the "small" CPUs (E-cores), when run at
the same performance level, are always more energy-efficient than the "big" or
"performance" CPUs (P-cores).  This means that, regardless of the scale-
invariant utilization of a task, as long as there is enough spare capacity on
E-cores, the relative cost of running it there is always lower.

Thus the idea is to register a perf domain per CPU type, which currently are
P-cores and E-cores, to represent the relative costs of running tasks on CPUs
of each type.  The states table in each of these perf domains is one-element
and that element only contains the cost value, which causes EAS to compare the
"E-core cost" with the "P-core cost" every time it has to make a decision, and
because the "E-core cost" is lower, it will always prefer E-cores as long as
there is enough spare capacity to run the given task on one of them.
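
For illustration only, the selection rule described above can be modeled in a few lines of user-space C.  This is not kernel code: struct stub_pd, pick_domain() and the capacity numbers are hypothetical names and values made up for the sketch, and the real EAS placement logic takes much more into account.  The sketch only shows that, with one cost value per CPU type, the cheaper type wins whenever it has room for the task.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a stub perf domain: one cost value per CPU type. */
struct stub_pd {
	const char *name;
	unsigned long cost;		/* relative cost; only the ordering matters */
	unsigned long spare_capacity;	/* largest spare capacity in the domain */
};

/*
 * Return the cheapest domain that can still fit a task of utilization
 * @task_util, or NULL if no domain has enough spare capacity.
 */
const struct stub_pd *pick_domain(const struct stub_pd *pds, size_t nr,
				  unsigned long task_util)
{
	const struct stub_pd *best = NULL;
	size_t i;

	assert(pds != NULL);

	for (i = 0; i < nr; i++) {
		if (pds[i].spare_capacity < task_util)
			continue;	/* the task does not fit here */
		if (!best || pds[i].cost < best->cost)
			best = &pds[i];
	}
	return best;
}
```

With the costs used in patch [6/6] (2 for P-cores, 1 for E-cores), the E-core domain is returned for every task that fits there and the P-core domain is only a fallback.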

The intel_pstate driver knows the type of each CPU, so it can create cpumasks
requisite for registering the perf domains as per the above, but the energy
model registration code needs to be adjusted to handle perf domains with
one-element states tables (further referred to as stub perf domains).  It
also needs to allow adding a new CPU to an existing perf domain to handle the
case in which some CPUs are offline to start with and are brought online later
via sysfs.  The first 4 patches in the series make the requisite energy model
changes.

Patch [5/6] updates the EAS setup code to allow it to work without the
schedutil cpufreq governor, which need not be used when intel_pstate is in
use (in the "active" mode, intel_pstate uses a built-in governor that can
work with EAS just fine because it also adjusts the CPU performance level to
utilization).

The last patch modifies intel_pstate to register the perf domains described
above and update them when new CPUs become available for the first time.

Please refer to the individual patch changelogs for details.

It has been verified that the behavior after the changes here is as intended,
that is, the perf domains are registered and EAS is enabled.

For easier access, the series is available on the intel_pstate-experimental-v2
branch in linux-pm.git:

https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/log/?h=intel_pstate-experimental-v2

Thanks!





* [RFC][PATCH v0.1 1/6] PM: EM: Move perf rebuilding function from schedutil to EM
  2024-11-08 16:09 [RFC][PATCH v0.1 0/6] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT Rafael J. Wysocki
@ 2024-11-08 16:36 ` Rafael J. Wysocki
  2024-11-08 16:37 ` [RFC][PATCH v0.1 2/6] PM: EM: Call em_compute_costs() from em_create_perf_table() Rafael J. Wysocki
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-08 16:36 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The sugov_eas_rebuild_sd() function defined in the schedutil cpufreq
governor implements generic functionality that may be useful in other
places.  In particular, going forward it will be used in the intel_pstate
driver.

For this reason, move it from schedutil to the energy model code and
rename it to em_rebuild_perf_domains().

This also involves getting rid of some #ifdeffery in schedutil, which
is a plus.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 include/linux/energy_model.h     |    2 ++
 kernel/power/energy_model.c      |   17 +++++++++++++++++
 kernel/sched/cpufreq_schedutil.c |   33 ++++++---------------------------
 3 files changed, 25 insertions(+), 27 deletions(-)

Index: linux-pm/kernel/power/energy_model.c
===================================================================
--- linux-pm.orig/kernel/power/energy_model.c
+++ linux-pm/kernel/power/energy_model.c
@@ -908,3 +908,20 @@ int em_update_performance_limits(struct
 	return 0;
 }
 EXPORT_SYMBOL_GPL(em_update_performance_limits);
+
+static void rebuild_sd_workfn(struct work_struct *work)
+{
+	rebuild_sched_domains_energy();
+}
+
+static DECLARE_WORK(rebuild_sd_work, rebuild_sd_workfn);
+
+void em_rebuild_perf_domains(void)
+{
+	/*
+	 * When called from the cpufreq_register_driver() path, the
+	 * cpu_hotplug_lock is already held, so use a work item to
+	 * avoid nested locking in rebuild_sched_domains().
+	 */
+	schedule_work(&rebuild_sd_work);
+}
Index: linux-pm/kernel/sched/cpufreq_schedutil.c
===================================================================
--- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
+++ linux-pm/kernel/sched/cpufreq_schedutil.c
@@ -604,31 +604,6 @@ static const struct kobj_type sugov_tuna
 
 /********************** cpufreq governor interface *********************/
 
-#ifdef CONFIG_ENERGY_MODEL
-static void rebuild_sd_workfn(struct work_struct *work)
-{
-	rebuild_sched_domains_energy();
-}
-
-static DECLARE_WORK(rebuild_sd_work, rebuild_sd_workfn);
-
-/*
- * EAS shouldn't be attempted without sugov, so rebuild the sched_domains
- * on governor changes to make sure the scheduler knows about it.
- */
-static void sugov_eas_rebuild_sd(void)
-{
-	/*
-	 * When called from the cpufreq_register_driver() path, the
-	 * cpu_hotplug_lock is already held, so use a work item to
-	 * avoid nested locking in rebuild_sched_domains().
-	 */
-	schedule_work(&rebuild_sd_work);
-}
-#else
-static inline void sugov_eas_rebuild_sd(void) { };
-#endif
-
 struct cpufreq_governor schedutil_gov;
 
 static struct sugov_policy *sugov_policy_alloc(struct cpufreq_policy *policy)
@@ -783,7 +758,11 @@ static int sugov_init(struct cpufreq_pol
 	if (ret)
 		goto fail;
 
-	sugov_eas_rebuild_sd();
+	/*
+	 * EAS shouldn't be attempted without sugov, so rebuild the sched_domains
+	 * on governor changes to make sure the scheduler knows about it.
+	 */
+	em_rebuild_perf_domains();
 
 out:
 	mutex_unlock(&global_tunables_lock);
@@ -827,7 +806,7 @@ static void sugov_exit(struct cpufreq_po
 	sugov_policy_free(sg_policy);
 	cpufreq_disable_fast_switch(policy);
 
-	sugov_eas_rebuild_sd();
+	em_rebuild_perf_domains();
 }
 
 static int sugov_start(struct cpufreq_policy *policy)
Index: linux-pm/include/linux/energy_model.h
===================================================================
--- linux-pm.orig/include/linux/energy_model.h
+++ linux-pm/include/linux/energy_model.h
@@ -179,6 +179,7 @@ int em_dev_compute_costs(struct device *
 int em_dev_update_chip_binning(struct device *dev);
 int em_update_performance_limits(struct em_perf_domain *pd,
 		unsigned long freq_min_khz, unsigned long freq_max_khz);
+void em_rebuild_perf_domains(void);
 
 /**
  * em_pd_get_efficient_state() - Get an efficient performance state from the EM
@@ -404,6 +405,7 @@ int em_update_performance_limits(struct
 {
 	return -EINVAL;
 }
+static inline void em_rebuild_perf_domains(void) {}
 #endif
 
 #endif





* [RFC][PATCH v0.1 2/6] PM: EM: Call em_compute_costs() from em_create_perf_table()
  2024-11-08 16:09 [RFC][PATCH v0.1 0/6] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT Rafael J. Wysocki
  2024-11-08 16:36 ` [RFC][PATCH v0.1 1/6] PM: EM: Move perf rebuilding function from schedutil to EM Rafael J. Wysocki
@ 2024-11-08 16:37 ` Rafael J. Wysocki
  2024-11-12  8:21   ` Dietmar Eggemann
  2024-11-08 16:38 ` [RFC][PATCH v0.1 3/6] PM: EM: Add special case to em_dev_register_perf_domain() Rafael J. Wysocki
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-08 16:37 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In preparation for subsequent changes, move the em_compute_costs()
invocation from em_create_perf_table() to em_create_pd() which is its
only caller.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 kernel/power/energy_model.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Index: linux-pm/kernel/power/energy_model.c
===================================================================
--- linux-pm.orig/kernel/power/energy_model.c
+++ linux-pm/kernel/power/energy_model.c
@@ -388,10 +388,6 @@ static int em_create_perf_table(struct d
 
 	em_init_performance(dev, pd, table, nr_states);
 
-	ret = em_compute_costs(dev, table, cb, nr_states, flags);
-	if (ret)
-		return -EINVAL;
-
 	return 0;
 }
 
@@ -434,6 +430,10 @@ static int em_create_pd(struct device *d
 	if (ret)
 		goto free_pd_table;
 
+	ret = em_compute_costs(dev, em_table->state, cb, nr_states, flags);
+	if (ret)
+		goto free_pd_table;
+
 	rcu_assign_pointer(pd->em_table, em_table);
 
 	if (_is_cpu_device(dev))





* [RFC][PATCH v0.1 3/6] PM: EM: Add special case to em_dev_register_perf_domain()
  2024-11-08 16:09 [RFC][PATCH v0.1 0/6] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT Rafael J. Wysocki
  2024-11-08 16:36 ` [RFC][PATCH v0.1 1/6] PM: EM: Move perf rebuilding function from schedutil to EM Rafael J. Wysocki
  2024-11-08 16:37 ` [RFC][PATCH v0.1 2/6] PM: EM: Call em_compute_costs() from em_create_perf_table() Rafael J. Wysocki
@ 2024-11-08 16:38 ` Rafael J. Wysocki
  2024-11-12  8:21   ` Dietmar Eggemann
  2024-11-18 15:24   ` Hongyan Xia
  2024-11-08 16:40 ` [RFC][PATCH v0.1 4/6] PM: EM: Introduce em_dev_expand_perf_domain() Rafael J. Wysocki
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-08 16:38 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Allow em_dev_register_perf_domain() to register a cost-only stub
perf domain with one-element states table if the .active_power()
callback is not provided.

Subsequently, this will be used by the intel_pstate driver to register
stub perf domains for CPUs on hybrid systems.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 kernel/power/energy_model.c |   26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

Index: linux-pm/kernel/power/energy_model.c
===================================================================
--- linux-pm.orig/kernel/power/energy_model.c
+++ linux-pm/kernel/power/energy_model.c
@@ -426,9 +426,11 @@ static int em_create_pd(struct device *d
 	if (!em_table)
 		goto free_pd;
 
-	ret = em_create_perf_table(dev, pd, em_table->state, cb, flags);
-	if (ret)
-		goto free_pd_table;
+	if (cb->active_power) {
+		ret = em_create_perf_table(dev, pd, em_table->state, cb, flags);
+		if (ret)
+			goto free_pd_table;
+	}
 
 	ret = em_compute_costs(dev, em_table->state, cb, nr_states, flags);
 	if (ret)
@@ -561,11 +563,20 @@ int em_dev_register_perf_domain(struct d
 {
 	unsigned long cap, prev_cap = 0;
 	unsigned long flags = 0;
+	bool stub_pd = false;
 	int cpu, ret;
 
 	if (!dev || !nr_states || !cb)
 		return -EINVAL;
 
+	if (!cb->active_power) {
+		if (!cb->get_cost || nr_states > 1 || microwatts)
+			return -EINVAL;
+
+		/* Special case: a stub perf domain. */
+		stub_pd = true;
+	}
+
 	/*
 	 * Use a mutex to serialize the registration of performance domains and
 	 * let the driver-defined callback functions sleep.
@@ -590,6 +601,15 @@ int em_dev_register_perf_domain(struct d
 				ret = -EEXIST;
 				goto unlock;
 			}
+
+			/*
+			 * The capacity need not be the same for all CPUs in a
+			 * stub perf domain, so long as the average cost of
+			 * running on each of them is approximately the same.
+			 */
+			if (stub_pd)
+				continue;
+
 			/*
 			 * All CPUs of a domain must have the same
 			 * micro-architecture since they all share the same
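
As a reading aid, the argument checks that the patch above adds to em_dev_register_perf_domain() can be restated as a small user-space C function.  Everything here is a hypothetical model: struct cb_model stands in for struct em_data_callback (only whether a callback is set matters), placeholder_cb() is a dummy, and the NULL checks on the device and the callback structure are left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct em_data_callback. */
struct cb_model {
	int (*active_power)(void);
	int (*get_cost)(void);
};

/* Dummy callback, used only to make a pointer non-NULL. */
int placeholder_cb(void)
{
	return 0;
}

/*
 * Returns 1 for a stub perf domain, 0 for a regular one and -EINVAL for
 * a combination of arguments that the patch rejects.
 */
int classify_registration(const struct cb_model *cb, unsigned int nr_states,
			  bool microwatts)
{
	assert(cb != NULL);

	if (!nr_states)
		return -EINVAL;

	if (cb->active_power)
		return 0;	/* regular perf domain, checked elsewhere */

	/* Stub: needs .get_cost(), exactly one state and no microwatts. */
	if (!cb->get_cost || nr_states > 1 || microwatts)
		return -EINVAL;

	return 1;
}
```

In other words, omitting .active_power() is only accepted together with .get_cost(), nr_states equal to 1 and microwatts unset.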





* [RFC][PATCH v0.1 4/6] PM: EM: Introduce em_dev_expand_perf_domain()
  2024-11-08 16:09 [RFC][PATCH v0.1 0/6] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2024-11-08 16:38 ` [RFC][PATCH v0.1 3/6] PM: EM: Add special case to em_dev_register_perf_domain() Rafael J. Wysocki
@ 2024-11-08 16:40 ` Rafael J. Wysocki
  2024-11-08 16:41 ` [RFC][PATCH v0.1 5/6] sched/topology: Allow .setpolicy() cpufreq drivers to enable EAS Rafael J. Wysocki
  2024-11-08 16:46 ` [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms Rafael J. Wysocki
  5 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-08 16:40 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Introduce a helper function for adding a CPU to an existing EM perf
domain.

Subsequently, this will be used by the intel_pstate driver to add new
CPUs to existing perf domains when those CPUs go online for the first
time after the initialization of the driver.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 include/linux/energy_model.h |    5 +++++
 kernel/power/energy_model.c  |   32 ++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

Index: linux-pm/kernel/power/energy_model.c
===================================================================
--- linux-pm.orig/kernel/power/energy_model.c
+++ linux-pm/kernel/power/energy_model.c
@@ -696,6 +696,38 @@ void em_dev_unregister_perf_domain(struc
 }
 EXPORT_SYMBOL_GPL(em_dev_unregister_perf_domain);
 
+/**
+ * em_dev_expand_perf_domain() - Expand CPU perf domain
+ * @dev: CPU device of a CPU in the perf domain.
+ * @new_cpu: CPU to add to the perf domain.
+ */
+int em_dev_expand_perf_domain(struct device *dev, int new_cpu)
+{
+	struct device *new_cpu_dev;
+	struct em_perf_domain *pd;
+
+	if (IS_ERR_OR_NULL(dev) || !_is_cpu_device(dev))
+		return -EINVAL;
+
+	new_cpu_dev = get_cpu_device(new_cpu);
+	if (!new_cpu_dev)
+		return -EINVAL;
+
+	guard(mutex)(&em_pd_mutex);
+
+	if (em_pd_get(new_cpu_dev))
+		return -EEXIST;
+
+	pd = em_pd_get(dev);
+	if (!pd)
+		return -EINVAL;
+
+	cpumask_set_cpu(new_cpu, em_span_cpus(pd));
+	new_cpu_dev->em_pd = pd;
+
+	return 0;
+}
+
 static struct em_perf_table __rcu *em_table_dup(struct em_perf_domain *pd)
 {
 	struct em_perf_table __rcu *em_table;
Index: linux-pm/include/linux/energy_model.h
===================================================================
--- linux-pm.orig/include/linux/energy_model.h
+++ linux-pm/include/linux/energy_model.h
@@ -172,6 +172,7 @@ int em_dev_register_perf_domain(struct d
 				struct em_data_callback *cb, cpumask_t *span,
 				bool microwatts);
 void em_dev_unregister_perf_domain(struct device *dev);
+int em_dev_expand_perf_domain(struct device *dev, int new_cpu);
 struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd);
 void em_table_free(struct em_perf_table __rcu *table);
 int em_dev_compute_costs(struct device *dev, struct em_perf_state *table,
@@ -354,6 +355,10 @@ int em_dev_register_perf_domain(struct d
 static inline void em_dev_unregister_perf_domain(struct device *dev)
 {
 }
+static inline int em_dev_expand_perf_domain(struct device *dev, int new_cpu)
+{
+	return -EINVAL;
+}
 static inline struct em_perf_domain *em_cpu_get(int cpu)
 {
 	return NULL;
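
The order of checks in em_dev_expand_perf_domain() from the patch above can be followed in this user-space C model.  It is a sketch under assumptions: struct pd_model, the cpu_pd[] array and MODEL_NR_CPUS are hypothetical stand-ins for struct em_perf_domain, the per-device em_pd pointer and the CPU count, a plain bitmask replaces the cpumask, and locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_NR_CPUS 8

/* Hypothetical stand-in for struct em_perf_domain. */
struct pd_model {
	unsigned long span;	/* bit n set: CPU n belongs to the domain */
};

/* Models the em_pd pointer of each CPU device. */
struct pd_model *cpu_pd[MODEL_NR_CPUS];

/* Add @new_cpu to the perf domain that @cpu already belongs to. */
int expand_perf_domain(int cpu, int new_cpu)
{
	struct pd_model *pd;

	if (cpu < 0 || cpu >= MODEL_NR_CPUS ||
	    new_cpu < 0 || new_cpu >= MODEL_NR_CPUS)
		return -EINVAL;

	if (cpu_pd[new_cpu])
		return -EEXIST;		/* already in some perf domain */

	pd = cpu_pd[cpu];
	if (!pd)
		return -EINVAL;		/* no domain to expand */

	/* Invariant of the model: @cpu is covered by its own domain. */
	assert(pd->span & (1UL << cpu));

	pd->span |= 1UL << new_cpu;
	cpu_pd[new_cpu] = pd;
	return 0;
}
```

Note that a CPU which already belongs to any perf domain is reported as -EEXIST before the target domain is even looked up, matching the order in the patch.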





* [RFC][PATCH v0.1 5/6] sched/topology: Allow .setpolicy() cpufreq drivers to enable EAS
  2024-11-08 16:09 [RFC][PATCH v0.1 0/6] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2024-11-08 16:40 ` [RFC][PATCH v0.1 4/6] PM: EM: Introduce em_dev_expand_perf_domain() Rafael J. Wysocki
@ 2024-11-08 16:41 ` Rafael J. Wysocki
  2024-11-11 11:54   ` Christian Loehle
  2024-11-08 16:46 ` [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms Rafael J. Wysocki
  5 siblings, 1 reply; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-08 16:41 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Some cpufreq drivers, like intel_pstate, have built-in governors that
are used instead of regular cpufreq governors, schedutil in particular,
but they can work with EAS just fine, so allow EAS to be used with
those drivers.

Also update the debug message printed when the cpufreq governor in
use is not schedutil and the related comment, to better match the
code after the change.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

I'm not sure how much value there is in refusing to enable EAS without
schedutil in general.  For instance, if there are no crossover points
between the cost curves for different perf domains, EAS may as well be
used with the performance and powersave governors AFAICS.

---
 kernel/sched/topology.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: linux-pm/kernel/sched/topology.c
===================================================================
--- linux-pm.orig/kernel/sched/topology.c
+++ linux-pm/kernel/sched/topology.c
@@ -251,7 +251,7 @@ static bool sched_is_eas_possible(const
 		return false;
 	}
 
-	/* Do not attempt EAS if schedutil is not being used. */
+	/* Do not attempt EAS with a cpufreq governor other than schedutil. */
 	for_each_cpu(i, cpu_mask) {
 		policy = cpufreq_cpu_get(i);
 		if (!policy) {
@@ -263,9 +263,9 @@ static bool sched_is_eas_possible(const
 		}
 		gov = policy->governor;
 		cpufreq_cpu_put(policy);
-		if (gov != &schedutil_gov) {
+		if (gov && gov != &schedutil_gov) {
 			if (sched_debug()) {
-				pr_info("rd %*pbl: Checking EAS, schedutil is mandatory\n",
+				pr_info("rd %*pbl: Checking EAS, cpufreq governor is not schedutil\n",
 					cpumask_pr_args(cpu_mask));
 			}
 			return false;
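
The effect of the one-token change in the patch above is easiest to see by comparing the old and new conditions side by side.  The following user-space C sketch does that with hypothetical stand-ins (struct gov_model, schedutil_model, ondemand_model); it assumes, in line with the patch subject, that a policy without a governor pointer corresponds to a .setpolicy() driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpufreq_governor. */
struct gov_model {
	const char *name;
};

struct gov_model schedutil_model = { "schedutil" };
struct gov_model ondemand_model = { "ondemand" };

/* Old condition: anything but schedutil, including no governor, blocks EAS. */
bool eas_blocked_before(const struct gov_model *gov)
{
	return gov != &schedutil_model;
}

/* New condition: a missing governor no longer blocks EAS. */
bool eas_blocked_after(const struct gov_model *gov)
{
	return gov && gov != &schedutil_model;
}
```

The two conditions differ only for a NULL governor; every regular governor other than schedutil is still rejected.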





* [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms
  2024-11-08 16:09 [RFC][PATCH v0.1 0/6] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2024-11-08 16:41 ` [RFC][PATCH v0.1 5/6] sched/topology: Allow .setpolicy() cpufreq drivers to enable EAS Rafael J. Wysocki
@ 2024-11-08 16:46 ` Rafael J. Wysocki
  2024-11-12  8:21   ` Dietmar Eggemann
  2024-11-18 16:34   ` Pierre Gondois
  5 siblings, 2 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-08 16:46 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Modify intel_pstate to register stub EM perf domains for CPUs on
hybrid platforms via em_dev_register_perf_domain() and to use
em_dev_expand_perf_domain() introduced previously for adding new
CPUs to existing EM perf domains when those CPUs become online for
the first time after driver initialization.

This change is targeting platforms (for example, Lunar Lake) where
"small" CPUs (E-cores) are always more energy-efficient than the "big"
or "performance" CPUs (P-cores) when run at the same HWP performance
level, so it is sufficient to tell the EAS that E-cores are always
preferred (so long as there is enough spare capacity on one of them
to run the given task).

Accordingly, the perf domains are registered per CPU type (that is,
all P-cores belong to one perf domain and all E-cores belong to another
perf domain) and they are registered only if asymmetric CPU capacity is
enabled.  Each perf domain has a one-element states table and that
element only contains the relative cost value (the other fields in
it are not initialized, so they are all equal to zero), and the cost
value for the E-core perf domain is lower.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/intel_pstate.c |  110 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 104 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -8,6 +8,7 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include <linux/energy_model.h>
 #include <linux/kernel.h>
 #include <linux/kernel_stat.h>
 #include <linux/module.h>
@@ -938,6 +939,12 @@ static struct freq_attr *hwp_cpufreq_att
 	NULL,
 };
 
+enum hybrid_cpu_type {
+	HYBRID_PCORE = 0,
+	HYBRID_ECORE,
+	HYBRID_NR_TYPES
+};
+
 static struct cpudata *hybrid_max_perf_cpu __read_mostly;
 /*
  * Protects hybrid_max_perf_cpu, the capacity_perf fields in struct cpudata,
@@ -945,6 +952,86 @@ static struct cpudata *hybrid_max_perf_c
  */
 static DEFINE_MUTEX(hybrid_capacity_lock);
 
+#ifdef CONFIG_ENERGY_MODEL
+struct hybrid_em_perf_domain {
+	cpumask_t cpumask;
+	struct device *dev;
+	struct em_data_callback cb;
+};
+
+static int hybrid_pcore_cost(struct device *dev, unsigned long freq,
+			     unsigned long *cost)
+{
+	/*
+	 * The number used here needs to be higher than the analogous
+	 * one in hybrid_ecore_cost() below.  The units and the actual
+	 * values don't matter.
+	 */
+	*cost = 2;
+	return 0;
+}
+
+static int hybrid_ecore_cost(struct device *dev, unsigned long freq,
+			     unsigned long *cost)
+{
+	*cost = 1;
+	return 0;
+}
+
+static struct hybrid_em_perf_domain perf_domains[HYBRID_NR_TYPES] = {
+	[HYBRID_PCORE] = { .cb.get_cost = hybrid_pcore_cost, },
+	[HYBRID_ECORE] = { .cb.get_cost = hybrid_ecore_cost, }
+};
+
+static bool hybrid_register_perf_domain(struct hybrid_em_perf_domain *pd)
+{
+	/*
+	 * Registering EM perf domains without asymmetric CPU capacity
+	 * support enabled is wasteful, so don't do that.
+	 */
+	if (!hybrid_max_perf_cpu)
+		return false;
+
+	pd->dev = get_cpu_device(cpumask_first(&pd->cpumask));
+	if (!pd->dev)
+		return false;
+
+	if (em_dev_register_perf_domain(pd->dev, 1, &pd->cb, &pd->cpumask, false)) {
+		pd->dev = NULL;
+		return false;
+	}
+
+	return true;
+}
+
+static void hybrid_register_all_perf_domains(void)
+{
+	enum hybrid_cpu_type type;
+
+	for (type = HYBRID_PCORE; type < HYBRID_NR_TYPES; type++)
+		hybrid_register_perf_domain(&perf_domains[type]);
+}
+
+static void hybrid_add_to_perf_domain(int cpu, enum hybrid_cpu_type type)
+{
+	struct hybrid_em_perf_domain *pd = &perf_domains[type];
+
+	guard(mutex)(&hybrid_capacity_lock);
+
+	if (cpumask_test_cpu(cpu, &pd->cpumask))
+		return;
+
+	cpumask_set_cpu(cpu, &pd->cpumask);
+	if (pd->dev)
+		em_dev_expand_perf_domain(pd->dev, cpu);
+	else if (hybrid_register_perf_domain(pd))
+		em_rebuild_perf_domains();
+}
+#else /* CONFIG_ENERGY_MODEL */
+static inline void hybrid_register_all_perf_domains(void) {}
+static inline void hybrid_add_to_perf_domain(int cpu, enum hybrid_cpu_type type) {}
+#endif /* !CONFIG_ENERGY_MODEL */
+
 static void hybrid_set_cpu_capacity(struct cpudata *cpu)
 {
 	arch_set_cpu_capacity(cpu->cpu, cpu->capacity_perf,
@@ -1034,11 +1121,14 @@ static void __hybrid_refresh_cpu_capacit
 	hybrid_update_cpu_capacity_scaling();
 }
 
-static void hybrid_refresh_cpu_capacity_scaling(void)
+static void hybrid_refresh_cpu_capacity_scaling(bool register_perf_domains)
 {
 	guard(mutex)(&hybrid_capacity_lock);
 
 	__hybrid_refresh_cpu_capacity_scaling();
+
+	if (register_perf_domains)
+		hybrid_register_all_perf_domains();
 }
 
 static void hybrid_init_cpu_capacity_scaling(bool refresh)
@@ -1049,7 +1139,7 @@ static void hybrid_init_cpu_capacity_sca
 	 * operation mode.
 	 */
 	if (refresh) {
-		hybrid_refresh_cpu_capacity_scaling();
+		hybrid_refresh_cpu_capacity_scaling(false);
 		return;
 	}
 
@@ -1059,10 +1149,14 @@ static void hybrid_init_cpu_capacity_sca
 	 * do not do that when SMT is in use.
 	 */
 	if (hwp_is_hybrid && !sched_smt_active() && arch_enable_hybrid_capacity_scale()) {
-		hybrid_refresh_cpu_capacity_scaling();
+		/*
+		 * Perf domains are not registered before setting hybrid_max_perf_cpu,
+		 * so register them all after setting up CPU capacity scaling.
+		 */
+		hybrid_refresh_cpu_capacity_scaling(true);
 		/*
 		 * Disabling ITMT causes sched domains to be rebuilt to disable asym
-		 * packing and enable asym capacity.
+		 * packing and enable asym capacity and EAS.
 		 */
 		sched_clear_itmt_support();
 	}
@@ -2215,12 +2309,16 @@ static int hwp_get_cpu_scaling(int cpu)
 
 	smp_call_function_single(cpu, hybrid_get_type, &cpu_type, 1);
 	/* P-cores have a smaller perf level-to-freqency scaling factor. */
-	if (cpu_type == 0x40)
+	if (cpu_type == 0x40) {
+		hybrid_add_to_perf_domain(cpu, HYBRID_PCORE);
 		return hybrid_scaling_factor;
+	}
 
 	/* Use default core scaling for E-cores */
-	if (cpu_type == 0x20)
+	if (cpu_type == 0x20) {
+		hybrid_add_to_perf_domain(cpu, HYBRID_ECORE);
 		return core_get_scaling();
+	}
 
 	/*
 	 * If reached here, this system is either non-hybrid (like Tiger
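
The decision made in hybrid_add_to_perf_domain() in the patch above (expand an already registered domain, otherwise try to register it and request a sched domain rebuild) can be traced with the user-space C model below.  All names are hypothetical: struct hybrid_pd_model, pd_models[] and asym_capacity_enabled are simplified stand-ins for the driver's data, a flag replaces the pd->dev pointer, and the EM calls are reduced to comments and a counter.

```c
#include <assert.h>
#include <stdbool.h>

enum cpu_type_model { MODEL_PCORE, MODEL_ECORE, MODEL_NR_TYPES };

struct hybrid_pd_model {
	unsigned long cpumask;	/* CPUs of this type seen so far */
	bool registered;	/* models pd->dev != NULL */
	unsigned int rebuilds;	/* sched domain rebuilds requested */
};

struct hybrid_pd_model pd_models[MODEL_NR_TYPES];
bool asym_capacity_enabled;	/* models hybrid_max_perf_cpu != NULL */

/* Record @cpu for @type and register or expand the domain as needed. */
void add_to_perf_domain(int cpu, enum cpu_type_model type)
{
	struct hybrid_pd_model *pd = &pd_models[type];
	unsigned long bit;

	assert(cpu >= 0 && cpu < 32);
	bit = 1UL << cpu;

	if (pd->cpumask & bit)
		return;			/* seen before, nothing to do */

	pd->cpumask |= bit;
	if (pd->registered)
		return;			/* driver: em_dev_expand_perf_domain() */

	if (asym_capacity_enabled) {
		pd->registered = true;	/* driver: em_dev_register_perf_domain() */
		pd->rebuilds++;		/* driver: em_rebuild_perf_domains() */
	}
}
```

Before asymmetric capacity support is enabled, CPUs are only collected in the mask; in the patch, the domains built from those masks are then registered in one go by hybrid_register_all_perf_domains().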





* Re: [RFC][PATCH v0.1 5/6] sched/topology: Allow .setpolicy() cpufreq drivers to enable EAS
  2024-11-08 16:41 ` [RFC][PATCH v0.1 5/6] sched/topology: Allow .setpolicy() cpufreq drivers to enable EAS Rafael J. Wysocki
@ 2024-11-11 11:54   ` Christian Loehle
  2024-11-11 13:54     ` Rafael J. Wysocki
  0 siblings, 1 reply; 22+ messages in thread
From: Christian Loehle @ 2024-11-11 11:54 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri

On 11/8/24 16:41, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Some cpufreq drivers, like intel_pstate, have built-in governors that
> are used instead of regular cpufreq governors, schedutil in particular,
> but they can work with EAS just fine, so allow EAS to be used with
> those drivers.
> 
> Also update the debug message printed when the cpufreq governor in
> use is not schedutil and the related comment, to better match the
> code after the change.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> I'm not sure how much value there is in refusing to enable EAS without
> schedutil in general.  For instance, if there are no crossover points
> between the cost curves for different perf domains, EAS may as well be
> used with the performance and powersave governors AFAICS.
 
Agreed, but having no cross-over points or no DVFS at all should be the
only instances, right?
For plain (non-intel_pstate) powersave and performance we could replace
sugov_effective_cpu_perf()
that determines the OPP of the perf-domain by the OPP they will be
choosing, but for the rest?
Also there is the entire uclamp thing, not sure what the best
solution is there.
Will intel_pstate just always ignore it? Might be better then to
depend on !intel_pstate?

> ---
>  kernel/sched/topology.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> Index: linux-pm/kernel/sched/topology.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/topology.c
> +++ linux-pm/kernel/sched/topology.c
> @@ -251,7 +251,7 @@ static bool sched_is_eas_possible(const
>  		return false;
>  	}
>  
> -	/* Do not attempt EAS if schedutil is not being used. */
> +	/* Do not attempt EAS with a cpufreq governor other than schedutil. */
>  	for_each_cpu(i, cpu_mask) {
>  		policy = cpufreq_cpu_get(i);
>  		if (!policy) {
> @@ -263,9 +263,9 @@ static bool sched_is_eas_possible(const
>  		}
>  		gov = policy->governor;
>  		cpufreq_cpu_put(policy);
> -		if (gov != &schedutil_gov) {
> +		if (gov && gov != &schedutil_gov) {
>  			if (sched_debug()) {
> -				pr_info("rd %*pbl: Checking EAS, schedutil is mandatory\n",
> +				pr_info("rd %*pbl: Checking EAS, cpufreq governor is not schedutil\n",
>  					cpumask_pr_args(cpu_mask));
>  			}
>  			return false;
> 
> 
> 
> 



* Re: [RFC][PATCH v0.1 5/6] sched/topology: Allow .setpolicy() cpufreq drivers to enable EAS
  2024-11-11 11:54   ` Christian Loehle
@ 2024-11-11 13:54     ` Rafael J. Wysocki
  2024-11-19 15:13       ` Vincent Guittot
  2024-11-19 17:37       ` Peter Zijlstra
  0 siblings, 2 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-11 13:54 UTC (permalink / raw)
  To: Christian Loehle
  Cc: Rafael J. Wysocki, Linux PM, LKML, Lukasz Luba, Peter Zijlstra,
	Srinivas Pandruvada, Len Brown, Dietmar Eggemann,
	Morten Rasmussen, Vincent Guittot, Ricardo Neri

On Mon, Nov 11, 2024 at 12:54 PM Christian Loehle
<christian.loehle@arm.com> wrote:
>
> On 11/8/24 16:41, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Some cpufreq drivers, like intel_pstate, have built-in governors that
> > are used instead of regular cpufreq governors, schedutil in particular,
> > but they can work with EAS just fine, so allow EAS to be used with
> > those drivers.
> >
> > Also update the debug message printed when the cpufreq governor in
> > use is not schedutil and the related comment, to better match the
> > code after the change.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > I'm not sure how much value there is in refusing to enable EAS without
> > schedutil in general.  For instance, if there are no crossover points
> > between the cost curves for different perf domains, EAS may as well be
> > used with the performance and powersave governors AFAICS.
>
> Agreed, but having no cross-over points or no DVFS at all should be the
> only instances, right?

Not really.  This is the most obvious case, but there are other less
obvious ones.

Say there are two cross-over points: The "performance" and
"powersave" governors should still be fine with EAS in that case.

Or what if somebody has a governor in user space that generally
behaves like schedutil?

Or what about ondemand?  Is it always completely broken with EAS?

> For plain (non-intel_pstate) powersave and performance we could replace
> sugov_effective_cpu_perf()
> that determines the OPP of the perf-domain by the OPP they will be
> choosing, but for the rest?

I generally think that depending on schedutil for EAS is a mistake.

I would just print a warning that results may be suboptimal or
generally not as expected if the cpufreq governor is not schedutil
instead of preventing EAS from running at all.
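
To make the "warn instead of refuse" idea concrete, here is a rough
user-space sketch.  It is not kernel code: eas_check_governor() and
enum eas_gov_verdict are hypothetical names made up for illustration,
and a governor is identified by name rather than by pointer.  A NULL
name stands for a .setpolicy() driver with no governor attached.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical verdicts; the kernel has no such enum. */
enum eas_gov_verdict {
	EAS_GOV_OK,	/* schedutil, or no governor at all (.setpolicy() driver) */
	EAS_GOV_WARN,	/* some other governor: EAS still allowed, with a warning */
};

/*
 * Model of the proposed policy: never refuse EAS because of the
 * governor, only warn when it is something other than schedutil.
 */
static enum eas_gov_verdict eas_check_governor(const char *gov_name)
{
	if (!gov_name || !strcmp(gov_name, "schedutil"))
		return EAS_GOV_OK;

	fprintf(stderr,
		"EAS: cpufreq governor '%s' is not schedutil, results may be suboptimal\n",
		gov_name);
	return EAS_GOV_WARN;
}
```

In this model the caller would enable EAS for both verdicts, so the
verdict only decides whether a message is printed.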

> Also there is the entire uclamp thing, not sure what the best
> solution is there.
> Will intel_pstate just always ignore it? Might be better then to
> depend on !intel_pstate?

Well, it can be made dependent on policy->policy ==
CPUFREQ_POLICY_POWERSAVE if gov is NULL or similar, but honestly why
bother?

> > ---
> >  kernel/sched/topology.c |    6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > Index: linux-pm/kernel/sched/topology.c
> > ===================================================================
> > --- linux-pm.orig/kernel/sched/topology.c
> > +++ linux-pm/kernel/sched/topology.c
> > @@ -251,7 +251,7 @@ static bool sched_is_eas_possible(const
> >               return false;
> >       }
> >
> > -     /* Do not attempt EAS if schedutil is not being used. */
> > +     /* Do not attempt EAS with a cpufreq governor other than schedutil. */
> >       for_each_cpu(i, cpu_mask) {
> >               policy = cpufreq_cpu_get(i);
> >               if (!policy) {
> > @@ -263,9 +263,9 @@ static bool sched_is_eas_possible(const
> >               }
> >               gov = policy->governor;
> >               cpufreq_cpu_put(policy);
> > -             if (gov != &schedutil_gov) {
> > +             if (gov && gov != &schedutil_gov) {
> >                       if (sched_debug()) {
> > -                             pr_info("rd %*pbl: Checking EAS, schedutil is mandatory\n",
> > +                             pr_info("rd %*pbl: Checking EAS, cpufreq governor is not schedutil\n",
> >                                       cpumask_pr_args(cpu_mask));
> >                       }
> >                       return false;
> >
> >
> >
> >
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC][PATCH v0.1 2/6] PM: EM: Call em_compute_costs() from em_create_perf_table()
  2024-11-08 16:37 ` [RFC][PATCH v0.1 2/6] PM: EM: Call em_compute_costs() from em_create_perf_table() Rafael J. Wysocki
@ 2024-11-12  8:21   ` Dietmar Eggemann
  0 siblings, 0 replies; 22+ messages in thread
From: Dietmar Eggemann @ 2024-11-12  8:21 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Morten Rasmussen, Vincent Guittot, Ricardo Neri

On 08/11/2024 17:37, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> In preparation for subsequent changes, move the em_compute_costs()
> invocation from em_create_perf_table() to em_create_pd() which is its
> only caller.

You have to do this since em_create_perf_table() is only called when
'cb->active_power != NULL'.

And 'cb->active_power == NULL' is what you use for your new 'stub' PD case.

Maybe worth mentioning already here? You do mention this in the
following patch though.

[...]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC][PATCH v0.1 3/6] PM: EM: Add special case to em_dev_register_perf_domain()
  2024-11-08 16:38 ` [RFC][PATCH v0.1 3/6] PM: EM: Add special case to em_dev_register_perf_domain() Rafael J. Wysocki
@ 2024-11-12  8:21   ` Dietmar Eggemann
  2024-11-18 15:24   ` Hongyan Xia
  1 sibling, 0 replies; 22+ messages in thread
From: Dietmar Eggemann @ 2024-11-12  8:21 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Morten Rasmussen, Vincent Guittot, Ricardo Neri

On 08/11/2024 17:38, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Allow em_dev_register_perf_domain() to register a cost-only stub
> perf domain with one-element states table if the .active_power()
> callback is not provided.
> 
> Subsequently, this will be used by the intel_pstate driver to register
> stub perf domains for CPUs on hybrid systems.

Looks like a 'stub' PD only distinguishes itself from a normal PD by not
checking that all CPUs in that PD have the same CPU capacity value?

I assume you do this since the Performance Cores (CPUs) can have
different CPU capacity values due to slightly different 'itmt prio' values?

So strictly speaking such an Intel hybrid machine would have to be a
tri-gear system to fit the definition of a PD.

I thought initially by reading the word 'stub' that you only set up a
part of the default EM infrastructure.

[...]


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms
  2024-11-08 16:46 ` [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms Rafael J. Wysocki
@ 2024-11-12  8:21   ` Dietmar Eggemann
  2024-11-19 14:38     ` Rafael J. Wysocki
  2024-11-18 16:34   ` Pierre Gondois
  1 sibling, 1 reply; 22+ messages in thread
From: Dietmar Eggemann @ 2024-11-12  8:21 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Morten Rasmussen, Vincent Guittot, Ricardo Neri

On 08/11/2024 17:46, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Modify intel_pstate to register stub EM perf domains for CPUs on
> hybrid platforms via em_dev_register_perf_domain() and to use
> em_dev_expand_perf_domain() introduced previously for adding new
> CPUs to existing EM perf domains when those CPUs become online for
> the first time after driver initialization.
> 
> This change is targeting platforms (for example, Lunar Lake) where
> "small" CPUs (E-cores) are always more energy-efficient than the "big"
> or "performance" CPUs (P-cores) when run at the same HWP performance
> level, so it is sufficient to tell the EAS that E-cores are always
> preferred (so long as there is enough spare capacity on one of them
> to run the given task).

By treating all big CPUs as one PD (ignoring the different itmt prio
values between them) we would have a system in which PDs are not in sync
with the asym_cap_list* or the CPU capacities of individual CPUs and sched
groups within the sched domain. Not sure if we want to go this way?

* used by misfit handling - 22d5607400c6 ("sched/fair: Check if a task
has a fitting CPU when updating misfit")

> Accordingly, the perf domains are registered per CPU type (that is,
> all P-cores belong to one perf domain and all E-cores belong to another
> perf domain) and they are registered only if asymmetric CPU capacity is
> enabled.  Each perf domain has a one-element states table and that
> element only contains the relative cost value (the other fields in
> it are not initialized, so they are all equal to zero), and the cost
> value for the E-core perf domain is lower.

[...]

> +static int hybrid_pcore_cost(struct device *dev, unsigned long freq,
> +			     unsigned long *cost)
> +{
> +	/*
> +	 * The number used here needs to be higher than the analogous
> +	 * one in hybrid_ecore_cost() below.  The units and the actual
> +	 * values don't matter.
> +	 */
> +	*cost = 2;
> +	return 0;

So you're not tying this to HFI energy scores?

> +}
> +
> +static int hybrid_ecore_cost(struct device *dev, unsigned long freq,
> +			     unsigned long *cost)
> +{
> +	*cost = 1;
> +	return 0;
> +}
> +
> +static struct hybrid_em_perf_domain perf_domains[HYBRID_NR_TYPES] = {
> +	[HYBRID_PCORE] = { .cb.get_cost = hybrid_pcore_cost, },
> +	[HYBRID_ECORE] = { .cb.get_cost = hybrid_ecore_cost, }
> +};
> +
> +static bool hybrid_register_perf_domain(struct hybrid_em_perf_domain *pd)
> +{
> +	/*
> +	 * Registering EM perf domains without asymmetric CPU capacity
> +	 * support enabled is wasteful, so don't do that.
> +	 */
> +	if (!hybrid_max_perf_cpu)
> +		return false;
> +
> +	pd->dev = get_cpu_device(cpumask_first(&pd->cpumask));
> +	if (!pd->dev)
> +		return false;
> +
> +	if (em_dev_register_perf_domain(pd->dev, 1, &pd->cb, &pd->cpumask, false)) {
> +		pd->dev = NULL;
> +		return false;
> +	}
> +
> +	return true;
> +}

What would the issues be if you used the existing (non-stub) ways to
set up the EM?

static int intel_pstate_get_cpu_cost()

static void intel_pstate_register_em(struct cpufreq_policy *policy)

  struct em_data_callback em_cb = EM_ADV_DATA_CB(NULL,
                                              intel_pstate_get_cpu_cost)

  em_dev_register_perf_domain(get_cpu_device(policy->cpu), 1,
                              &em_cb, policy->related_cpus, 1);
                                      ^^^^^^^^^^^^^^^^^^^^*

static void intel_pstate_set_register_em_fct(void)

  default_driver->register_em = intel_pstate_register_em

static int __init intel_pstate_init(void)

  ...
  intel_pstate_set_register_em_fct()
  ...

I guess one issue is the per-CPU policy as an argument to
em_dev_register_perf_domain() (*) ?

> +static void hybrid_register_all_perf_domains(void)
> +{
> +	enum hybrid_cpu_type type;
> +
> +	for (type = HYBRID_PCORE; type < HYBRID_NR_TYPES; type++)
> +		hybrid_register_perf_domain(&perf_domains[type]);
> +}
> +
> +static void hybrid_add_to_perf_domain(int cpu, enum hybrid_cpu_type type)
> +{
> +	struct hybrid_em_perf_domain *pd = &perf_domains[type];
> +
> +	guard(mutex)(&hybrid_capacity_lock);
> +
> +	if (cpumask_test_cpu(cpu, &pd->cpumask))
> +		return;
> +
> +	cpumask_set_cpu(cpu, &pd->cpumask);
> +	if (pd->dev)
> +		em_dev_expand_perf_domain(pd->dev, cpu);
> +	else if (hybrid_register_perf_domain(pd))
> +		em_rebuild_perf_domains();

I assume that the 'if' and the 'else if' condition here are only taken
when the CPU is brought online after boot?

[...]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC][PATCH v0.1 3/6] PM: EM: Add special case to em_dev_register_perf_domain()
  2024-11-08 16:38 ` [RFC][PATCH v0.1 3/6] PM: EM: Add special case to em_dev_register_perf_domain() Rafael J. Wysocki
  2024-11-12  8:21   ` Dietmar Eggemann
@ 2024-11-18 15:24   ` Hongyan Xia
  2024-11-19 13:51     ` Rafael J. Wysocki
  1 sibling, 1 reply; 22+ messages in thread
From: Hongyan Xia @ 2024-11-18 15:24 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri

On 08/11/2024 16:38, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Allow em_dev_register_perf_domain() to register a cost-only stub
> perf domain with one-element states table if the .active_power()
> callback is not provided.
> 
> Subsequently, this will be used by the intel_pstate driver to register
> stub perf domains for CPUs on hybrid systems.
> 
> No intentional functional impact.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>   kernel/power/energy_model.c |   26 +++++++++++++++++++++++---
>   1 file changed, 23 insertions(+), 3 deletions(-)
> 
> Index: linux-pm/kernel/power/energy_model.c
> ===================================================================
> --- linux-pm.orig/kernel/power/energy_model.c
> +++ linux-pm/kernel/power/energy_model.c
> @@ -426,9 +426,11 @@ static int em_create_pd(struct device *d
>   	if (!em_table)
>   		goto free_pd;
>   
> -	ret = em_create_perf_table(dev, pd, em_table->state, cb, flags);
> -	if (ret)
> -		goto free_pd_table;
> +	if (cb->active_power) {
> +		ret = em_create_perf_table(dev, pd, em_table->state, cb, flags);
> +		if (ret)
> +			goto free_pd_table;
> +	}
>   
>   	ret = em_compute_costs(dev, em_table->state, cb, nr_states, flags);
>   	if (ret)
> @@ -561,11 +563,20 @@ int em_dev_register_perf_domain(struct d
>   {
>   	unsigned long cap, prev_cap = 0;
>   	unsigned long flags = 0;
> +	bool stub_pd = false;
>   	int cpu, ret;
>   
>   	if (!dev || !nr_states || !cb)
>   		return -EINVAL;
>   
> +	if (!cb->active_power) {
> +		if (!cb->get_cost || nr_states > 1 || microwatts)
> +			return -EINVAL;
> +
> +		/* Special case: a stub perf domain. */
> +		stub_pd = true;
> +	}
> +

I wonder if the only purpose of stub_pd is to just skip the capacity 
check below, which doesn't look very nice.

I may be echoing Dietmar's comments here. What's the problem of just 
having 3 domains?

Or, could you just specify the same capacities so that the same-capacity 
check won't fail, but just to use hardware load or CPU pressure to model 
the slight difference in real capacities? This way you'd re-use a lot of 
existing infrastructure.

>   	/*
>   	 * Use a mutex to serialize the registration of performance domains and
>   	 * let the driver-defined callback functions sleep.
> @@ -590,6 +601,15 @@ int em_dev_register_perf_domain(struct d
>   				ret = -EEXIST;
>   				goto unlock;
>   			}
> +
> +			/*
> +			 * The capacity need not be the same for all CPUs in a
> +			 * stub perf domain, so long as the average cost of
> +			 * running on each of them is approximately the same.
> +			 */
> +			if (stub_pd)
> +				continue;
> +
>   			/*
>   			 * All CPUs of a domain must have the same
>   			 * micro-architecture since they all share the same
> 
> 
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms
  2024-11-08 16:46 ` [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms Rafael J. Wysocki
  2024-11-12  8:21   ` Dietmar Eggemann
@ 2024-11-18 16:34   ` Pierre Gondois
  2024-11-19 17:20     ` Rafael J. Wysocki
  1 sibling, 1 reply; 22+ messages in thread
From: Pierre Gondois @ 2024-11-18 16:34 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM
  Cc: LKML, Lukasz Luba, Peter Zijlstra, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri,
	Christian Loehle



On 11/8/24 17:46, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Modify intel_pstate to register stub EM perf domains for CPUs on
> hybrid platforms via em_dev_register_perf_domain() and to use
> em_dev_expand_perf_domain() introduced previously for adding new
> CPUs to existing EM perf domains when those CPUs become online for
> the first time after driver initialization.
> 
> This change is targeting platforms (for example, Lunar Lake) where
> "small" CPUs (E-cores) are always more energy-efficient than the "big"
> or "performance" CPUs (P-cores) when run at the same HWP performance
> level, so it is sufficient to tell the EAS that E-cores are always
> preferred (so long as there is enough spare capacity on one of them
> to run the given task).
> 
> Accordingly, the perf domains are registered per CPU type (that is,
> all P-cores belong to one perf domain and all E-cores belong to another
> perf domain) and they are registered only if asymmetric CPU capacity is
> enabled.  Each perf domain has a one-element states table and that
> element only contains the relative cost value (the other fields in
> it are not initialized, so they are all equal to zero), and the cost
> value for the E-core perf domain is lower.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>   drivers/cpufreq/intel_pstate.c |  110 ++++++++++++++++++++++++++++++++++++++---
>   1 file changed, 104 insertions(+), 6 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -8,6 +8,7 @@
>   
>   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>   
> +#include <linux/energy_model.h>
>   #include <linux/kernel.h>
>   #include <linux/kernel_stat.h>
>   #include <linux/module.h>
> @@ -938,6 +939,12 @@ static struct freq_attr *hwp_cpufreq_att
>   	NULL,
>   };
>   
> +enum hybrid_cpu_type {
> +	HYBRID_PCORE = 0,
> +	HYBRID_ECORE,
> +	HYBRID_NR_TYPES
> +};
> +
>   static struct cpudata *hybrid_max_perf_cpu __read_mostly;
>   /*
>    * Protects hybrid_max_perf_cpu, the capacity_perf fields in struct cpudata,
> @@ -945,6 +952,86 @@ static struct cpudata *hybrid_max_perf_c
>    */
>   static DEFINE_MUTEX(hybrid_capacity_lock);
>   
> +#ifdef CONFIG_ENERGY_MODEL
> +struct hybrid_em_perf_domain {
> +	cpumask_t cpumask;
> +	struct device *dev;
> +	struct em_data_callback cb;
> +};
> +
> +static int hybrid_pcore_cost(struct device *dev, unsigned long freq,
> +			     unsigned long *cost)
> +{
> +	/*
> +	 * The number used here needs to be higher than the analogous
> +	 * one in hybrid_ecore_cost() below.  The units and the actual
> +	 * values don't matter.
> +	 */
> +	*cost = 2;
> +	return 0;
> +}
> +
> +static int hybrid_ecore_cost(struct device *dev, unsigned long freq,
> +			     unsigned long *cost)
> +{
> +	*cost = 1;
> +	return 0;
> +}

The artificial EM was introduced for CPPC based platforms since these platforms
only provide an 'efficiency class' entry to describe the relative efficiency
of CPUs. The case seems similar to yours.
'Fake' OPPs were created to have an incentive for EAS to balance the load on
the CPUs in one perf. domain. Indeed, in feec(), during the energy
computation of a pd, if the cost is independent from the max_util value,
then one CPU in the pd could end up having a high util, and another CPU
zero util.
For CPPC platforms, this was problematic as a lower OPP could have been
selected for the same load, so energy was lost for no reason.

In your case it seems that the OPP selection is done independently on each
CPU. However I assume it is still more energy efficient to have 2 CPUs
loaded at 50% than one CPU loaded at 100% and an idle CPU.
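
The effect described above can be seen in a simplified user-space model
of the per-PD energy estimate.  This is a sketch under assumptions, not
the kernel's em_cpu_energy(): struct model_state and model_pd_energy()
are hypothetical, and the estimate is taken to be the cost of the state
picked for the busiest CPU multiplied by the summed utilization of the
PD.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an EM performance state. */
struct model_state {
	unsigned long perf;	/* highest utilization this state can serve */
	unsigned long cost;	/* relative cost per unit of utilization */
};

/*
 * Simplified PD energy estimate: pick the lowest state that can serve
 * the busiest CPU, then charge its cost for the summed utilization.
 */
static unsigned long model_pd_energy(const struct model_state *states,
				     size_t nr_states,
				     const unsigned long *cpu_util,
				     size_t nr_cpus)
{
	unsigned long max_util = 0, sum_util = 0;
	size_t i, s;

	if (!nr_states)
		return 0;

	for (i = 0; i < nr_cpus; i++) {
		sum_util += cpu_util[i];
		if (cpu_util[i] > max_util)
			max_util = cpu_util[i];
	}

	/* Fall back to the last state if none is big enough. */
	for (s = 0; s < nr_states - 1; s++) {
		if (states[s].perf >= max_util)
			break;
	}

	return states[s].cost * sum_util;
}
```

With a one-element table the result depends only on the summed
utilization, so two CPUs at 512 each and one CPU at 1024 next to an
idle one get the same estimate.  With a two-element table the packed
placement selects the costlier state and comes out higher, which is the
incentive to spread the load that the 'fake' OPPs provide.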

Also, as Dietmar suggested, maybe it would make sense to have some
way to prefer a CPU with an "energy saving" HFI configuration over
a similar CPU with a "performance" HFI configuration.

Also, out of curiosity, do you have energy numbers to share?

Regards,
Pierre

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC][PATCH v0.1 3/6] PM: EM: Add special case to em_dev_register_perf_domain()
  2024-11-18 15:24   ` Hongyan Xia
@ 2024-11-19 13:51     ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-19 13:51 UTC (permalink / raw)
  To: Hongyan Xia
  Cc: Rafael J. Wysocki, Linux PM, LKML, Lukasz Luba, Peter Zijlstra,
	Srinivas Pandruvada, Len Brown, Dietmar Eggemann,
	Morten Rasmussen, Vincent Guittot, Ricardo Neri

On Mon, Nov 18, 2024 at 4:25 PM Hongyan Xia <hongyan.xia2@arm.com> wrote:
>
> On 08/11/2024 16:38, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Allow em_dev_register_perf_domain() to register a cost-only stub
> > perf domain with one-element states table if the .active_power()
> > callback is not provided.
> >
> > Subsequently, this will be used by the intel_pstate driver to register
> > stub perf domains for CPUs on hybrid systems.
> >
> > No intentional functional impact.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >   kernel/power/energy_model.c |   26 +++++++++++++++++++++++---
> >   1 file changed, 23 insertions(+), 3 deletions(-)
> >
> > Index: linux-pm/kernel/power/energy_model.c
> > ===================================================================
> > --- linux-pm.orig/kernel/power/energy_model.c
> > +++ linux-pm/kernel/power/energy_model.c
> > @@ -426,9 +426,11 @@ static int em_create_pd(struct device *d
> >       if (!em_table)
> >               goto free_pd;
> >
> > -     ret = em_create_perf_table(dev, pd, em_table->state, cb, flags);
> > -     if (ret)
> > -             goto free_pd_table;
> > +     if (cb->active_power) {
> > +             ret = em_create_perf_table(dev, pd, em_table->state, cb, flags);
> > +             if (ret)
> > +                     goto free_pd_table;
> > +     }
> >
> >       ret = em_compute_costs(dev, em_table->state, cb, nr_states, flags);
> >       if (ret)
> > @@ -561,11 +563,20 @@ int em_dev_register_perf_domain(struct d
> >   {
> >       unsigned long cap, prev_cap = 0;
> >       unsigned long flags = 0;
> > +     bool stub_pd = false;
> >       int cpu, ret;
> >
> >       if (!dev || !nr_states || !cb)
> >               return -EINVAL;
> >
> > +     if (!cb->active_power) {
> > +             if (!cb->get_cost || nr_states > 1 || microwatts)
> > +                     return -EINVAL;
> > +
> > +             /* Special case: a stub perf domain. */
> > +             stub_pd = true;
> > +     }
> > +
>
> I wonder if the only purpose of stub_pd is to just skip the capacity
> check below, which doesn't look very nice.

It is.

I guess I could just skip it if nr_states == 1 because that case means
the same cost for all frequency values.
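
A user-space sketch of that alternative, assuming the capacity check is
a simple "all CPUs must match the previous one" loop.  The function
model_check_pd_capacities() is hypothetical and takes an array of
capacities instead of calling arch_scale_cpu_capacity().

```c
#include <errno.h>
#include <stddef.h>

/*
 * Model of the same-capacity check with the nr_states == 1 shortcut:
 * a single state means one cost for all frequencies, so differing
 * capacities within the domain are tolerated.
 */
static int model_check_pd_capacities(const unsigned long *cap,
				     size_t nr_cpus, unsigned int nr_states)
{
	unsigned long prev_cap = 0;
	size_t i;

	if (nr_states == 1)
		return 0;

	for (i = 0; i < nr_cpus; i++) {
		/* All CPUs of a regular domain must have the same capacity. */
		if (prev_cap && prev_cap != cap[i])
			return -EINVAL;

		prev_cap = cap[i];
	}

	return 0;
}
```

That would make the separate stub_pd flag unnecessary for this
particular check.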

>
> I may be echoing Dietmar's comments here. What's the problem of just
> having 3 domains?

The energy-efficiency of a CPU is not strictly related to its capacity.

It's about the cases when there are some special CPUs that can
turbo-up higher, but there's no other difference between them and the
other CPUs in the domain.

> Or, could you just specify the same capacities so that the same-capacity
> check won't fail, but just to use hardware load or CPU pressure to model
> the slight difference in real capacities? This way you'd re-use a lot of
> existing infrastructure.

That would have been confusing though, so thanks, but no thanks.

> >       /*
> >        * Use a mutex to serialize the registration of performance domains and
> >        * let the driver-defined callback functions sleep.
> > @@ -590,6 +601,15 @@ int em_dev_register_perf_domain(struct d
> >                               ret = -EEXIST;
> >                               goto unlock;
> >                       }
> > +
> > +                     /*
> > +                      * The capacity need not be the same for all CPUs in a
> > +                      * stub perf domain, so long as the average cost of
> > +                      * running on each of them is approximately the same.
> > +                      */
> > +                     if (stub_pd)
> > +                             continue;
> > +
> >                       /*
> >                        * All CPUs of a domain must have the same
> >                        * micro-architecture since they all share the same
> >
> >
> >
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms
  2024-11-12  8:21   ` Dietmar Eggemann
@ 2024-11-19 14:38     ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-19 14:38 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Rafael J. Wysocki, Linux PM, LKML, Lukasz Luba, Peter Zijlstra,
	Srinivas Pandruvada, Len Brown, Morten Rasmussen, Vincent Guittot,
	Ricardo Neri

First off, thanks for all the feedback!

On Tue, Nov 12, 2024 at 9:21 AM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 08/11/2024 17:46, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Modify intel_pstate to register stub EM perf domains for CPUs on
> > hybrid platforms via em_dev_register_perf_domain() and to use
> > em_dev_expand_perf_domain() introduced previously for adding new
> > CPUs to existing EM perf domains when those CPUs become online for
> > the first time after driver initialization.
> >
> > This change is targeting platforms (for example, Lunar Lake) where
> > "small" CPUs (E-cores) are always more energy-efficient than the "big"
> > or "performance" CPUs (P-cores) when run at the same HWP performance
> > level, so it is sufficient to tell the EAS that E-cores are always
> > preferred (so long as there is enough spare capacity on one of them
> > to run the given task).
>
> By treating all big CPUs (ignoring the different itmt prio values
> between them) we would have a system in which PD's are not in sync with
> the asym_cap_list* or the CPU capacities of individual CPUs and sched
> groups within the sched domain. Not sure if we want to go this way?

I guess you want the biggest tasks to be scheduled on the most-capable CPUs.

That's fair, and it may even improve single-threaded performance in
some cases I suppose, but then the cost for the "favored cores" PD
would be the same as for the "other P-cores" PD because there is no
difference between them other than the top-most turbo bin, so we'd
compare PDs with the same cost and that wouldn't be super-useful.

> * used by misfit handling - 22d5607400c6 ("sched/fair: Check if a task
> has a fitting CPU when updating misfit")
>
> > Accordingly, the perf domains are registered per CPU type (that is,
> > all P-cores belong to one perf domain and all E-cores belong to another
> > perf domain) and they are registered only if asymmetric CPU capacity is
> > enabled.  Each perf domain has a one-element states table and that
> > element only contains the relative cost value (the other fields in
> > it are not initialized, so they are all equal to zero), and the cost
> > value for the E-core perf domain is lower.
>
> [...]
>
> > +static int hybrid_pcore_cost(struct device *dev, unsigned long freq,
> > +                          unsigned long *cost)
> > +{
> > +     /*
> > +      * The number used here needs to be higher than the analogous
> > +      * one in hybrid_ecore_cost() below.  The units and the actual
> > +      * values don't matter.
> > +      */
> > +     *cost = 2;
> > +     return 0;
>
> So you're not tying this to HFI energy scores?

Not at this time.

> > +}
> > +
> > +static int hybrid_ecore_cost(struct device *dev, unsigned long freq,
> > +                          unsigned long *cost)
> > +{
> > +     *cost = 1;
> > +     return 0;
> > +}
> > +
> > +static struct hybrid_em_perf_domain perf_domains[HYBRID_NR_TYPES] = {
> > +     [HYBRID_PCORE] = { .cb.get_cost = hybrid_pcore_cost, },
> > +     [HYBRID_ECORE] = { .cb.get_cost = hybrid_ecore_cost, }
> > +};
> > +
> > +static bool hybrid_register_perf_domain(struct hybrid_em_perf_domain *pd)
> > +{
> > +     /*
> > +      * Registering EM perf domains without asymmetric CPU capacity
> > +      * support enabled is wasteful, so don't do that.
> > +      */
> > +     if (!hybrid_max_perf_cpu)
> > +             return false;
> > +
> > +     pd->dev = get_cpu_device(cpumask_first(&pd->cpumask));
> > +     if (!pd->dev)
> > +             return false;
> > +
> > +     if (em_dev_register_perf_domain(pd->dev, 1, &pd->cb, &pd->cpumask, false)) {
> > +             pd->dev = NULL;
> > +             return false;
> > +     }
> > +
> > +     return true;
> > +}
>
> What are the issues in case you would use the existing ways (non-stub)
> to setup the EM?
>
> static int intel_pstate_get_cpu_cost()
>
> static void intel_pstate_register_em(struct cpufreq_policy *policy)
>
>   struct em_data_callback em_cb = EM_ADV_DATA_CB(NULL,
>                                               intel_pstate_get_cpu_cost)
>
>   em_dev_register_perf_domain(get_cpu_device(policy->cpu), 1,
>                               &em_cb, policy->related_cpus, 1);
>                                       ^^^^^^^^^^^^^^^^^^^^*

I'm not sure what you are asking about here, but I'll try to answer.

No, I don't want to register a PD per policy with one CPU in it
because that would mean uselessly comparing PDs with the same cost and
CPU capacity.

> static void intel_pstate_set_register_em_fct(void)
>
>   default_driver->register_em = intel_pstate_register_em

No, I don't want to register PDs through cpufreq because it is too
early (and see above).

> static int __init intel_pstate_init(void)
>
>   ...
>   intel_pstate_set_register_em_fct()
>   ...
>
> I guess one issue is the per-CPU policy as an argument to
> em_dev_register_perf_domain() (*) ?

Yes.

> > +static void hybrid_register_all_perf_domains(void)
> > +{
> > +     enum hybrid_cpu_type type;
> > +
> > +     for (type = HYBRID_PCORE; type < HYBRID_NR_TYPES; type++)
> > +             hybrid_register_perf_domain(&perf_domains[type]);
> > +}
> > +
> > +static void hybrid_add_to_perf_domain(int cpu, enum hybrid_cpu_type type)
> > +{
> > +     struct hybrid_em_perf_domain *pd = &perf_domains[type];
> > +
> > +     guard(mutex)(&hybrid_capacity_lock);
> > +
> > +     if (cpumask_test_cpu(cpu, &pd->cpumask))
> > +             return;
> > +
> > +     cpumask_set_cpu(cpu, &pd->cpumask);
> > +     if (pd->dev)
> > +             em_dev_expand_perf_domain(pd->dev, cpu);
> > +     else if (hybrid_register_perf_domain(pd))
> > +             em_rebuild_perf_domains();
>
> I assume that the 'if' and the 'else if' condition here are only taken
> when the CPU is brought online after boot?

Yes.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC][PATCH v0.1 5/6] sched/topology: Allow .setpolicy() cpufreq drivers to enable EAS
  2024-11-11 13:54     ` Rafael J. Wysocki
@ 2024-11-19 15:13       ` Vincent Guittot
  2024-11-19 17:37       ` Peter Zijlstra
  1 sibling, 0 replies; 22+ messages in thread
From: Vincent Guittot @ 2024-11-19 15:13 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Christian Loehle, Rafael J. Wysocki, Linux PM, LKML, Lukasz Luba,
	Peter Zijlstra, Srinivas Pandruvada, Len Brown, Dietmar Eggemann,
	Morten Rasmussen, Ricardo Neri

On Mon, 11 Nov 2024 at 14:54, Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Mon, Nov 11, 2024 at 12:54 PM Christian Loehle
> <christian.loehle@arm.com> wrote:
> >
> > On 11/8/24 16:41, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > Some cpufreq drivers, like intel_pstate, have built-in governors that
> > > are used instead of regular cpufreq governors, schedutil in particular,
> > > but they can work with EAS just fine, so allow EAS to be used with
> > > those drivers.
> > >
> > > Also update the debug message printed when the cpufreq governor in
> > > use is not schedutil and the related comment, to better match the
> > > code after the change.
> > >
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >
> > > I'm not sure how much value there is in refusing to enable EAS without
> > > schedutil in general.  For instance, if there are no crossover points
> > > between the cost curves for different perf domains, EAS may as well be
> > > used with the performance and powersave governors AFAICS.
> >
> > Agreed, but having no cross-over points or no DVFS at all should be the
> > only instances, right?
>
> Not really.  This is the most obvious case, but there are other less
> obvious ones.
>
> Say there are two cross-over points: the "performance" and
> "powersave" governors should still be fine with EAS in that case.
>
> Or what if somebody has a governor in user space that generally
> behaves like schedutil?
>
> Or what about ondemand?  Is it always completely broken with EAS?

The only requirement from EAS is to know which OPP, and at what cost,
the cpufreq governor will select for a given utilization level of the
CPU. schedutil provides this with sugov_effective_cpu_perf(). Any other
governor that can provide such an estimate of the OPP and its associated
cost should be OK.

powersave and performance should be pretty obvious; I'm not so sure
about ondemand.
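
As a rough illustration of what such an estimate could look like, here
is a user-space sketch.  enum model_gov and model_effective_perf() are
hypothetical; the schedutil branch assumes the usual 25% headroom on top
of utilization, clamped to the allowed range, and is only an
approximation of what sugov_effective_cpu_perf() does.

```c
/* Hypothetical governor identifiers for the model. */
enum model_gov {
	MODEL_GOV_PERFORMANCE,
	MODEL_GOV_POWERSAVE,
	MODEL_GOV_SCHEDUTIL,
};

/*
 * Estimate the performance level a governor would pick for a CPU at
 * the given utilization, within [min_perf, max_perf].
 */
static unsigned long model_effective_perf(enum model_gov gov,
					  unsigned long util,
					  unsigned long min_perf,
					  unsigned long max_perf)
{
	unsigned long perf;

	switch (gov) {
	case MODEL_GOV_PERFORMANCE:
		/* Always runs at the top of the allowed range. */
		return max_perf;
	case MODEL_GOV_POWERSAVE:
		/* Always runs at the bottom of the allowed range. */
		return min_perf;
	case MODEL_GOV_SCHEDUTIL:
	default:
		/* Assumed: utilization plus 25% headroom, then clamped. */
		perf = util + (util >> 2);
		if (perf > max_perf)
			perf = max_perf;
		if (perf < min_perf)
			perf = min_perf;
		return perf;
	}
}
```

The first two cases are trivial because the answer does not depend on
utilization at all.  A governor like ondemand would need a model of its
sampling behavior to fit in here, and that is the hard part.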

>
> > For plain (non-intel_pstate) powersave and performance we could replace
> > sugov_effective_cpu_perf()
> > that determines the OPP of the perf-domain by the OPP they will be
> > choosing, but for the rest?
>
> I generally think that depending on schedutil for EAS is a mistake.
>
> I would just print a warning that results may be suboptimal or
> generally not as expected if the cpufreq governor is not schedutil
> instead of preventing EAS from running at all.
>
> > Also there is the entire uclamp thing, not sure what the best
> > solution is there.
> > Will intel_pstate just always ignore it? Might be better then to
> > depend on !intel_pstate?
>
> Well, it can be made dependent on policy->policy ==
> CPUFREQ_POLICY_POWERSAVE if gov is NULL or similar, but honestly why
> bother?
>
> > > ---
> > >  kernel/sched/topology.c |    6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > Index: linux-pm/kernel/sched/topology.c
> > > ===================================================================
> > > --- linux-pm.orig/kernel/sched/topology.c
> > > +++ linux-pm/kernel/sched/topology.c
> > > @@ -251,7 +251,7 @@ static bool sched_is_eas_possible(const
> > >               return false;
> > >       }
> > >
> > > -     /* Do not attempt EAS if schedutil is not being used. */
> > > +     /* Do not attempt EAS with a cpufreq governor other than schedutil. */
> > >       for_each_cpu(i, cpu_mask) {
> > >               policy = cpufreq_cpu_get(i);
> > >               if (!policy) {
> > > @@ -263,9 +263,9 @@ static bool sched_is_eas_possible(const
> > >               }
> > >               gov = policy->governor;
> > >               cpufreq_cpu_put(policy);
> > > -             if (gov != &schedutil_gov) {
> > > +             if (gov && gov != &schedutil_gov) {
> > >                       if (sched_debug()) {
> > > -                             pr_info("rd %*pbl: Checking EAS, schedutil is mandatory\n",
> > > +                             pr_info("rd %*pbl: Checking EAS, cpufreq governor is not schedutil\n",
> > >                                       cpumask_pr_args(cpu_mask));
> > >                       }
> > >                       return false;
> > >
> > >
> > >
> > >
> >
> >


* Re: [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms
  2024-11-18 16:34   ` Pierre Gondois
@ 2024-11-19 17:20     ` Rafael J. Wysocki
  2024-12-16 15:32       ` Pierre Gondois
  0 siblings, 1 reply; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-19 17:20 UTC (permalink / raw)
  To: Pierre Gondois
  Cc: Rafael J. Wysocki, Linux PM, LKML, Lukasz Luba, Peter Zijlstra,
	Srinivas Pandruvada, Len Brown, Dietmar Eggemann,
	Morten Rasmussen, Vincent Guittot, Ricardo Neri, Christian Loehle

On Mon, Nov 18, 2024 at 5:34 PM Pierre Gondois <pierre.gondois@arm.com> wrote:
>
>
>
> On 11/8/24 17:46, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Modify intel_pstate to register stub EM perf domains for CPUs on
> > hybrid platforms via em_dev_register_perf_domain() and to use
> > em_dev_expand_perf_domain() introduced previously for adding new
> > CPUs to existing EM perf domains when those CPUs become online for
> > the first time after driver initialization.
> >
> > This change is targeting platforms (for example, Lunar Lake) where
> > "small" CPUs (E-cores) are always more energy-efficient than the "big"
> > or "performance" CPUs (P-cores) when run at the same HWP performance
> > level, so it is sufficient to tell the EAS that E-cores are always
> > preferred (so long as there is enough spare capacity on one of them
> > to run the given task).
> >
> > Accordingly, the perf domains are registered per CPU type (that is,
> > all P-cores belong to one perf domain and all E-cores belong to another
> > perf domain) and they are registered only if asymmetric CPU capacity is
> > enabled.  Each perf domain has a one-element states table and that
> > element only contains the relative cost value (the other fields in
> > it are not initialized, so they are all equal to zero), and the cost
> > value for the E-core perf domain is lower.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >   drivers/cpufreq/intel_pstate.c |  110 ++++++++++++++++++++++++++++++++++++++---
> >   1 file changed, 104 insertions(+), 6 deletions(-)
> >
> > Index: linux-pm/drivers/cpufreq/intel_pstate.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> > +++ linux-pm/drivers/cpufreq/intel_pstate.c
> > @@ -8,6 +8,7 @@
> >
> >   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> >
> > +#include <linux/energy_model.h>
> >   #include <linux/kernel.h>
> >   #include <linux/kernel_stat.h>
> >   #include <linux/module.h>
> > @@ -938,6 +939,12 @@ static struct freq_attr *hwp_cpufreq_att
> >       NULL,
> >   };
> >
> > +enum hybrid_cpu_type {
> > +     HYBRID_PCORE = 0,
> > +     HYBRID_ECORE,
> > +     HYBRID_NR_TYPES
> > +};
> > +
> >   static struct cpudata *hybrid_max_perf_cpu __read_mostly;
> >   /*
> >    * Protects hybrid_max_perf_cpu, the capacity_perf fields in struct cpudata,
> > @@ -945,6 +952,86 @@ static struct cpudata *hybrid_max_perf_c
> >    */
> >   static DEFINE_MUTEX(hybrid_capacity_lock);
> >
> > +#ifdef CONFIG_ENERGY_MODEL
> > +struct hybrid_em_perf_domain {
> > +     cpumask_t cpumask;
> > +     struct device *dev;
> > +     struct em_data_callback cb;
> > +};
> > +
> > +static int hybrid_pcore_cost(struct device *dev, unsigned long freq,
> > +                          unsigned long *cost)
> > +{
> > +     /*
> > +      * The number used here needs to be higher than the analogous
> > +      * one in hybrid_ecore_cost() below.  The units and the actual
> > +      * values don't matter.
> > +      */
> > +     *cost = 2;
> > +     return 0;
> > +}
> > +
> > +static int hybrid_ecore_cost(struct device *dev, unsigned long freq,
> > +                          unsigned long *cost)
> > +{
> > +     *cost = 1;
> > +     return 0;
> > +}
>
> The artificial EM was introduced for CPPC based platforms since these platforms
> only provide an 'efficiency class' entry to describe the relative efficiency
> of CPUs. The case seems similar to yours.

It is, but I don't particularly like the CPPC driver's approach to this.

> 'Fake' OPPs were created to have an incentive for EAS to balance the load on
> the CPUs in one perf. domain. Indeed, in feec(), during the energy
> computation of a pd, if the cost is independent from the max_util value,
> then one CPU in the pd could end up having a high util, and another CPU a
> NULL util.

Isn't this a consequence of disabling load balancing by EAS?

> For CPPC platforms, this was problematic as a lower OPP could have been
> selected for the same load, so energy was lost for no reason.
>
> In your case it seems that the OPP selection is done independently on each
> CPU. However I assume it is still more energy efficient to have 2 CPUs
> loaded at 50% than one CPU loaded at 100% and an idle CPU.

Maybe.

It really depends on the cost of the idle state etc.

> Also, as Dietmar suggested, maybe it would make sense to have some
> way to prefer a CPU with an "energy saving" HFI configuration over
> a similar CPU with a "performance" HFI configuration.

As it happens, E-cores have higher energy-efficiency scores in HFI AFAICS.

> Also, out of curiosity, do you have energy numbers to share?

Not yet, but there will be some going forward.

Thanks!


* Re: [RFC][PATCH v0.1 5/6] sched/topology: Allow .setpolicy() cpufreq drivers to enable EAS
  2024-11-11 13:54     ` Rafael J. Wysocki
  2024-11-19 15:13       ` Vincent Guittot
@ 2024-11-19 17:37       ` Peter Zijlstra
  2024-11-19 19:28         ` Rafael J. Wysocki
  1 sibling, 1 reply; 22+ messages in thread
From: Peter Zijlstra @ 2024-11-19 17:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Christian Loehle, Rafael J. Wysocki, Linux PM, LKML, Lukasz Luba,
	Srinivas Pandruvada, Len Brown, Dietmar Eggemann,
	Morten Rasmussen, Vincent Guittot, Ricardo Neri

On Mon, Nov 11, 2024 at 02:54:43PM +0100, Rafael J. Wysocki wrote:

> Or what about ondemand?  Is it always completely broken with EAS?

I thought that thing was mostly considered broken anyway :-)

> > For plain (non-intel_pstate) powersave and performance we could replace
> > sugov_effective_cpu_perf()
> > that determines the OPP of the perf-domain by the OPP they will be
> > choosing, but for the rest?
> 
> I generally think that depending on schedutil for EAS is a mistake.

Well, the thinking was that we wanted to move to a single governor, and
not proliferate things.



* Re: [RFC][PATCH v0.1 5/6] sched/topology: Allow .setpolicy() cpufreq drivers to enable EAS
  2024-11-19 17:37       ` Peter Zijlstra
@ 2024-11-19 19:28         ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-11-19 19:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rafael J. Wysocki, Christian Loehle, Rafael J. Wysocki, Linux PM,
	LKML, Lukasz Luba, Srinivas Pandruvada, Len Brown,
	Dietmar Eggemann, Morten Rasmussen, Vincent Guittot, Ricardo Neri

On Tue, Nov 19, 2024 at 6:37 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Nov 11, 2024 at 02:54:43PM +0100, Rafael J. Wysocki wrote:
>
> > Or what about ondemand?  Is it always completely broken with EAS?
>
> I thought that thing was mostly considered broken anyway :-)

Well, it's still there in the tree, although honestly I don't know how
many users of it there are.

> > > For plain (non-intel_pstate) powersave and performance we could replace
> > > sugov_effective_cpu_perf()
> > > that determines the OPP of the perf-domain by the OPP they will be
> > > choosing, but for the rest?
> >
> > I generally think that depending on schedutil for EAS is a mistake.
>
> Well, the thinking was that we wanted to move to a single governor, and
> not proliferate things.

Thing is, intel_pstate in its default configuration doesn't use a
separate cpufreq governor at all.  It allows P-code to select
P-states, but on the new HW the result of this isn't really that much
different from what schedutil would do.

The cpufreq governor check needs to be adjusted at least for this
case, but overall it should be done in cpufreq because it refers to
cpufreq internals.
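
As a rough illustration of the adjustment discussed here, the governor check
from patch 5/6 can be modelled in userspace as below. This is only a sketch:
struct governor, struct policy, schedutil_model and governor_allows_eas() are
stand-in names invented for the example, and the NULL governor pointer stands
for a .setpolicy() driver such as intel_pstate in its default configuration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the cpufreq structures; not the kernel types. */
struct governor {
	const char *name;
};

struct policy {
	const struct governor *governor;	/* NULL for .setpolicy() drivers */
};

static const struct governor schedutil_model = { "schedutil" };

/*
 * Mirror of the patched condition "gov && gov != &schedutil_gov":
 * EAS is refused only when a governor is present and it is not schedutil.
 */
static bool governor_allows_eas(const struct policy *policy)
{
	const struct governor *gov = policy->governor;

	return !gov || gov == &schedutil_model;
}
```

A policy without a governor and a policy using the schedutil stand-in both
pass, while any other governor is still rejected, which matches the behaviour
of the hunk quoted earlier in the thread.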


* Re: [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms
  2024-11-19 17:20     ` Rafael J. Wysocki
@ 2024-12-16 15:32       ` Pierre Gondois
  2024-12-16 17:25         ` Rafael J. Wysocki
  0 siblings, 1 reply; 22+ messages in thread
From: Pierre Gondois @ 2024-12-16 15:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Linux PM, LKML, Lukasz Luba, Peter Zijlstra,
	Srinivas Pandruvada, Len Brown, Dietmar Eggemann,
	Morten Rasmussen, Vincent Guittot, Ricardo Neri, Christian Loehle



On 11/19/24 18:20, Rafael J. Wysocki wrote:
> On Mon, Nov 18, 2024 at 5:34 PM Pierre Gondois <pierre.gondois@arm.com> wrote:
>>
>>
>>
>> On 11/8/24 17:46, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> Modify intel_pstate to register stub EM perf domains for CPUs on
>>> hybrid platforms via em_dev_register_perf_domain() and to use
>>> em_dev_expand_perf_domain() introduced previously for adding new
>>> CPUs to existing EM perf domains when those CPUs become online for
>>> the first time after driver initialization.
>>>
>>> This change is targeting platforms (for example, Lunar Lake) where
>>> "small" CPUs (E-cores) are always more energy-efficient than the "big"
>>> or "performance" CPUs (P-cores) when run at the same HWP performance
>>> level, so it is sufficient to tell the EAS that E-cores are always
>>> preferred (so long as there is enough spare capacity on one of them
>>> to run the given task).
>>>
>>> Accordingly, the perf domains are registered per CPU type (that is,
>>> all P-cores belong to one perf domain and all E-cores belong to another
>>> perf domain) and they are registered only if asymmetric CPU capacity is
>>> enabled.  Each perf domain has a one-element states table and that
>>> element only contains the relative cost value (the other fields in
>>> it are not initialized, so they are all equal to zero), and the cost
>>> value for the E-core perf domain is lower.
>>>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> ---
>>>    drivers/cpufreq/intel_pstate.c |  110 ++++++++++++++++++++++++++++++++++++++---
>>>    1 file changed, 104 insertions(+), 6 deletions(-)
>>>
>>> Index: linux-pm/drivers/cpufreq/intel_pstate.c
>>> ===================================================================
>>> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
>>> +++ linux-pm/drivers/cpufreq/intel_pstate.c
>>> @@ -8,6 +8,7 @@
>>>
>>>    #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>>
>>> +#include <linux/energy_model.h>
>>>    #include <linux/kernel.h>
>>>    #include <linux/kernel_stat.h>
>>>    #include <linux/module.h>
>>> @@ -938,6 +939,12 @@ static struct freq_attr *hwp_cpufreq_att
>>>        NULL,
>>>    };
>>>
>>> +enum hybrid_cpu_type {
>>> +     HYBRID_PCORE = 0,
>>> +     HYBRID_ECORE,
>>> +     HYBRID_NR_TYPES
>>> +};
>>> +
>>>    static struct cpudata *hybrid_max_perf_cpu __read_mostly;
>>>    /*
>>>     * Protects hybrid_max_perf_cpu, the capacity_perf fields in struct cpudata,
>>> @@ -945,6 +952,86 @@ static struct cpudata *hybrid_max_perf_c
>>>     */
>>>    static DEFINE_MUTEX(hybrid_capacity_lock);
>>>
>>> +#ifdef CONFIG_ENERGY_MODEL
>>> +struct hybrid_em_perf_domain {
>>> +     cpumask_t cpumask;
>>> +     struct device *dev;
>>> +     struct em_data_callback cb;
>>> +};
>>> +
>>> +static int hybrid_pcore_cost(struct device *dev, unsigned long freq,
>>> +                          unsigned long *cost)
>>> +{
>>> +     /*
>>> +      * The number used here needs to be higher than the analogous
>>> +      * one in hybrid_ecore_cost() below.  The units and the actual
>>> +      * values don't matter.
>>> +      */
>>> +     *cost = 2;
>>> +     return 0;
>>> +}
>>> +
>>> +static int hybrid_ecore_cost(struct device *dev, unsigned long freq,
>>> +                          unsigned long *cost)
>>> +{
>>> +     *cost = 1;
>>> +     return 0;
>>> +}
>>
>> The artificial EM was introduced for CPPC based platforms since these platforms
>> only provide an 'efficiency class' entry to describe the relative efficiency
>> of CPUs. The case seems similar to yours.
> 
> It is, but I don't particularly like the CPPC driver's approach to this.
> 
>> 'Fake' OPPs were created to have an incentive for EAS to balance the load on
>> the CPUs in one perf. domain. Indeed, in feec(), during the energy
>> computation of a pd, if the cost is independent from the max_util value,
>> then one CPU in the pd could end up having a high util, and another CPU a
>> NULL util.
> 
> Isn't this a consequence of disabling load balancing by EAS?

Yes. Going in that direction, I think this patch from Vincent should help
balance the load in your case. The patch evaluates other factors when the
energy cost of multiple candidate CPUs is the same.

That is, if all CPUs of the same type have only one OPP, the number of tasks
and the load of the CPUs are then compared. This is not the case currently.
Doing so will help to avoid having one CPU close to being overutilized while
others are idle.
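
The tie-breaking idea described above can be sketched as follows. This is not
the code from Vincent's patch: struct candidate, candidate_better() and
pick_cpu() are hypothetical names, and the order of the secondary criteria
(number of tasks first, then load) is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical per-CPU candidate record; not a kernel structure. */
struct candidate {
	int cpu;
	unsigned long energy;		/* estimated energy if the task goes here */
	unsigned int nr_running;	/* tasks currently on this CPU */
	unsigned long load;		/* current load of this CPU */
};

/* Return nonzero if 'a' is a strictly better placement than 'b'. */
static int candidate_better(const struct candidate *a,
			    const struct candidate *b)
{
	if (a->energy != b->energy)
		return a->energy < b->energy;
	/* Equal energy: prefer the CPU running fewer tasks (assumed order). */
	if (a->nr_running != b->nr_running)
		return a->nr_running < b->nr_running;
	/* Still tied: prefer the less loaded CPU. */
	return a->load < b->load;
}

/* Return the CPU number of the best candidate, or -1 if there is none. */
static int pick_cpu(const struct candidate *c, size_t n)
{
	size_t i, best = 0;

	if (n == 0)
		return -1;

	for (i = 1; i < n; i++) {
		if (candidate_better(&c[i], &c[best]))
			best = i;
	}
	return c[best].cpu;
}
```

With a single OPP per CPU type, every CPU in a perf domain yields the same
energy estimate, so only the secondary criteria decide the placement in this
model.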

However, I think it would still be better to have multiple OPPs in the energy
model. Indeed, it would be closer to reality, as I assume that for Intel as
well there might be frequency domains where the frequency of the domain is led
by the most utilized CPU.
This would also avoid hitting corner cases. For instance, if there is one big
task and many small tasks, balancing on the number of tasks per CPU is not the
best idea.

https://lore.kernel.org/all/20240830130309.2141697-4-vincent.guittot@linaro.org/

> 
>> For CPPC platforms, this was problematic as a lower OPP could have been
>> selected for the same load, so energy was lost for no reason.
>>
>> In your case it seems that the OPP selection is done independently on each
>> CPU. However I assume it is still more energy efficient to have 2 CPUs
>> loaded at 50% than one CPU loaded at 100% and an idle CPU.
> 
> Maybe.
> 
> It really depends on the cost of the idle state etc.
> 
>> Also, as Dietmar suggested, maybe it would make sense to have some
>> way to prefer a CPU with an "energy saving" HFI configuration over
>> a similar CPU with a "performance" HFI configuration.
> 
> As it happens, E-cores have higher energy-efficiency scores in HFI AFAICS.
> 
>> Also, out of curiosity, do you have energy numbers to share?
> 
> Not yet, but there will be some going forward.
> 
> Thanks!


* Re: [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms
  2024-12-16 15:32       ` Pierre Gondois
@ 2024-12-16 17:25         ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2024-12-16 17:25 UTC (permalink / raw)
  To: Pierre Gondois
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Linux PM, LKML, Lukasz Luba,
	Peter Zijlstra, Srinivas Pandruvada, Len Brown, Dietmar Eggemann,
	Morten Rasmussen, Vincent Guittot, Ricardo Neri, Christian Loehle

On Mon, Dec 16, 2024 at 4:33 PM Pierre Gondois <pierre.gondois@arm.com> wrote:
>
>
>
> On 11/19/24 18:20, Rafael J. Wysocki wrote:
> > On Mon, Nov 18, 2024 at 5:34 PM Pierre Gondois <pierre.gondois@arm.com> wrote:
> >>
> >>
> >>
> >> On 11/8/24 17:46, Rafael J. Wysocki wrote:
> >>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>
> >>> Modify intel_pstate to register stub EM perf domains for CPUs on
> >>> hybrid platforms via em_dev_register_perf_domain() and to use
> >>> em_dev_expand_perf_domain() introduced previously for adding new
> >>> CPUs to existing EM perf domains when those CPUs become online for
> >>> the first time after driver initialization.
> >>>
> >>> This change is targeting platforms (for example, Lunar Lake) where
> >>> "small" CPUs (E-cores) are always more energy-efficient than the "big"
> >>> or "performance" CPUs (P-cores) when run at the same HWP performance
> >>> level, so it is sufficient to tell the EAS that E-cores are always
> >>> preferred (so long as there is enough spare capacity on one of them
> >>> to run the given task).
> >>>
> >>> Accordingly, the perf domains are registered per CPU type (that is,
> >>> all P-cores belong to one perf domain and all E-cores belong to another
> >>> perf domain) and they are registered only if asymmetric CPU capacity is
> >>> enabled.  Each perf domain has a one-element states table and that
> >>> element only contains the relative cost value (the other fields in
> >>> it are not initialized, so they are all equal to zero), and the cost
> >>> value for the E-core perf domain is lower.
> >>>
> >>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>> ---
> >>>    drivers/cpufreq/intel_pstate.c |  110 ++++++++++++++++++++++++++++++++++++++---
> >>>    1 file changed, 104 insertions(+), 6 deletions(-)
> >>>
> >>> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> >>> ===================================================================
> >>> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> >>> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> >>> @@ -8,6 +8,7 @@
> >>>
> >>>    #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> >>>
> >>> +#include <linux/energy_model.h>
> >>>    #include <linux/kernel.h>
> >>>    #include <linux/kernel_stat.h>
> >>>    #include <linux/module.h>
> >>> @@ -938,6 +939,12 @@ static struct freq_attr *hwp_cpufreq_att
> >>>        NULL,
> >>>    };
> >>>
> >>> +enum hybrid_cpu_type {
> >>> +     HYBRID_PCORE = 0,
> >>> +     HYBRID_ECORE,
> >>> +     HYBRID_NR_TYPES
> >>> +};
> >>> +
> >>>    static struct cpudata *hybrid_max_perf_cpu __read_mostly;
> >>>    /*
> >>>     * Protects hybrid_max_perf_cpu, the capacity_perf fields in struct cpudata,
> >>> @@ -945,6 +952,86 @@ static struct cpudata *hybrid_max_perf_c
> >>>     */
> >>>    static DEFINE_MUTEX(hybrid_capacity_lock);
> >>>
> >>> +#ifdef CONFIG_ENERGY_MODEL
> >>> +struct hybrid_em_perf_domain {
> >>> +     cpumask_t cpumask;
> >>> +     struct device *dev;
> >>> +     struct em_data_callback cb;
> >>> +};
> >>> +
> >>> +static int hybrid_pcore_cost(struct device *dev, unsigned long freq,
> >>> +                          unsigned long *cost)
> >>> +{
> >>> +     /*
> >>> +      * The number used here needs to be higher than the analogous
> >>> +      * one in hybrid_ecore_cost() below.  The units and the actual
> >>> +      * values don't matter.
> >>> +      */
> >>> +     *cost = 2;
> >>> +     return 0;
> >>> +}
> >>> +
> >>> +static int hybrid_ecore_cost(struct device *dev, unsigned long freq,
> >>> +                          unsigned long *cost)
> >>> +{
> >>> +     *cost = 1;
> >>> +     return 0;
> >>> +}
> >>
> >> The artificial EM was introduced for CPPC based platforms since these platforms
> >> only provide an 'efficiency class' entry to describe the relative efficiency
> >> of CPUs. The case seems similar to yours.
> >
> > It is, but I don't particularly like the CPPC driver's approach to this.
> >
> >> 'Fake' OPPs were created to have an incentive for EAS to balance the load on
> >> the CPUs in one perf. domain. Indeed, in feec(), during the energy
> >> computation of a pd, if the cost is independent from the max_util value,
> >> then one CPU in the pd could end up having a high util, and another CPU a
> >> NULL util.
> >
> > Isn't this a consequence of disabling load balancing by EAS?
>
> Yes. Going in that direction, I think this patch from Vincent should help
> balance the load in your case. The patch evaluates other factors when the
> energy cost of multiple candidate CPUs is the same.
>
> That is, if all CPUs of the same type have only one OPP, the number of tasks
> and the load of the CPUs are then compared. This is not the case currently.
> Doing so will help to avoid having one CPU close to being overutilized while
> others are idle.
>
> However, I think it would still be better to have multiple OPPs in the energy
> model. Indeed, it would be closer to reality, as I assume that for Intel as
> well there might be frequency domains where the frequency of the domain is led
> by the most utilized CPU.

There are a couple of problems with this on my target platforms.

First, it is not actually known what the real OPPs are and how the
coordination works.

For some cores (P-cores mostly) the voltage can be adjusted per-core
and for some others there are coordination domains, but the
coordination there involves idle states (for instance, one core may be
allowed to run at the max turbo frequency when the other ones in the
same domain are in idle states, but not otherwise) and dynamic
balancing (that is, the effective capacity depends on how much energy
is used by the domain over time).

Thus whatever is put into the perf states table will be way off most
of the time and there isn't even a good way to choose the numbers to
put in there.  Using the entire range of HWP P-states for that would
be completely impractical IMV because it would only increase overhead
for no real benefit.  Either it would need to be done per-CPU, which
doesn't really make sense because CPUs of the same type really share
the same cost-performance curve, or the assumption that the entire
domain is led by the most utilized CPU would need to be lifted to some
extent.  That would require some more intrusive changes in EAS which
I'd rather avoid unless the simplest approach doesn't work.
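
For reference, the "simplest approach" can be pictured with the userspace
model below, in which each perf domain carries a single relative cost value
and the task goes to the cheapest domain that has enough spare capacity. All
names (struct stub_pd, stub_pd_energy(), pick_pd()) are hypothetical, and the
"cost times utilization" energy estimate is a simplification assumed for the
example rather than the exact EAS computation.

```c
#include <stddef.h>

/* Hypothetical model of a stub perf domain with a one-element states table. */
struct stub_pd {
	unsigned long cost;		/* single relative cost value */
	unsigned long spare_cap;	/* largest spare capacity among its CPUs */
};

/* Simplified energy estimate: the cost does not depend on utilization. */
static unsigned long stub_pd_energy(const struct stub_pd *pd,
				    unsigned long sum_util)
{
	return pd->cost * sum_util;
}

/*
 * Return the index of the perf domain in which placing a task of
 * utilization 'task_util' adds the least energy, or -1 if it fits nowhere.
 */
static int pick_pd(const struct stub_pd *pds, size_t n,
		   unsigned long task_util)
{
	unsigned long delta, best_delta = 0;
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Skip domains without enough spare capacity for the task. */
		if (pds[i].spare_cap < task_util)
			continue;

		/* Energy added by the task: cost * (U + u) - cost * U. */
		delta = stub_pd_energy(&pds[i], task_util);
		if (best < 0 || delta < best_delta) {
			best = (int)i;
			best_delta = delta;
		}
	}
	return best;
}
```

With a P-core domain of cost 2 and an E-core domain of cost 1, the E-core
domain wins whenever the task fits there, and the P-core domain is chosen only
when the E-cores lack spare capacity, which is the behaviour the stub cost
callbacks in patch 6/6 are meant to produce.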

The second problem is that the current platforms are much smaller than
what we're expecting to see in the future and whatever is done today
needs to scale.

Also, I really wouldn't want to have to register special perf domains
for favored cores that share the cost-performance curve with the other
cores of the same type except that they can turbo-up higher if
everyone else is idle or similar.

> This would also avoid hitting corner cases. For instance, if there is one big
> task and many small tasks, balancing on the number of tasks per CPU is not the
> best idea.
>
> https://lore.kernel.org/all/20240830130309.2141697-4-vincent.guittot@linaro.org/

Understood.


Thread overview: 22+ messages
2024-11-08 16:09 [RFC][PATCH v0.1 0/6] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT Rafael J. Wysocki
2024-11-08 16:36 ` [RFC][PATCH v0.1 1/6] PM: EM: Move perf rebuilding function from schedutil to EM Rafael J. Wysocki
2024-11-08 16:37 ` [RFC][PATCH v0.1 2/6] PM: EM: Call em_compute_costs() from em_create_perf_table() Rafael J. Wysocki
2024-11-12  8:21   ` Dietmar Eggemann
2024-11-08 16:38 ` [RFC][PATCH v0.1 3/6] PM: EM: Add special case to em_dev_register_perf_domain() Rafael J. Wysocki
2024-11-12  8:21   ` Dietmar Eggemann
2024-11-18 15:24   ` Hongyan Xia
2024-11-19 13:51     ` Rafael J. Wysocki
2024-11-08 16:40 ` [RFC][PATCH v0.1 4/6] PM: EM: Introduce em_dev_expand_perf_domain() Rafael J. Wysocki
2024-11-08 16:41 ` [RFC][PATCH v0.1 5/6] sched/topology: Allow .setpolicy() cpufreq drivers to enable EAS Rafael J. Wysocki
2024-11-11 11:54   ` Christian Loehle
2024-11-11 13:54     ` Rafael J. Wysocki
2024-11-19 15:13       ` Vincent Guittot
2024-11-19 17:37       ` Peter Zijlstra
2024-11-19 19:28         ` Rafael J. Wysocki
2024-11-08 16:46 ` [RFC][PATCH v0.1 6/6] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms Rafael J. Wysocki
2024-11-12  8:21   ` Dietmar Eggemann
2024-11-19 14:38     ` Rafael J. Wysocki
2024-11-18 16:34   ` Pierre Gondois
2024-11-19 17:20     ` Rafael J. Wysocki
2024-12-16 15:32       ` Pierre Gondois
2024-12-16 17:25         ` Rafael J. Wysocki