linux-pm.vger.kernel.org archive mirror
* [PATCH v2 0/2] cpufreq: CPPC: Rework FIE warning prints and cppc_scale_freq_tick()
@ 2025-08-28 11:02 Jie Zhan
  2025-08-28 11:02 ` [PATCH v2 1/2] cpufreq: CPPC: Don't warn if FIE init fails to read counters Jie Zhan
  2025-08-28 11:02 ` [PATCH v2 2/2] cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs Jie Zhan
  0 siblings, 2 replies; 3+ messages in thread
From: Jie Zhan @ 2025-08-28 11:02 UTC (permalink / raw)
  To: viresh.kumar, rafael, ionela.voinescu, beata.michalska,
	zhenglifeng1
  Cc: linux-pm, linux-acpi, linux-arm-kernel, linuxarm, zhanjie9,
	jonathan.cameron

Reading performance feedback counters on offline or low-power idle CPUs may
return 0, which is interpreted as -EFAULT.

This leads to two issues related to the CPPC FIE:

1. When booting only a subset of the CPUs in policy->related_cpus (i.e.
some CPUs under the cpufreq policy are offline), the CPPC FIE
initialization prints "failed to read perf counters for cpu" warnings.

2. On our platform, with the CPC regs in System Memory and a power-down
idle state enabled, once the CPPC FIE is registered successfully there are
repeated "failed to read perf counters" warnings because
cppc_scale_freq_workfn() tries to access the counters of remote CPUs that
have entered the idle state.
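
In both cases the message comes from the same pattern around
cppc_get_perf_ctrs() in drivers/cpufreq/cppc_cpufreq.c, roughly (a
simplified sketch, not the exact call sites):

	ret = cppc_get_perf_ctrs(cpu, &fb_ctrs);
	if (ret)
		pr_warn("%s: failed to read perf counters for cpu:%d: %d\n",
			__func__, cpu, ret);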

To solve the above issues:

Patch 1 removes the warning printed when the CPPC FIE initialization fails
to read counters on offline CPUs and changes the log level to debug.  This
patch can be applied separately.

Patch 2 moves the update of the FIE arch_freq_scale into the scheduler
tick for non-PCC regs and keeps the existing deferred mechanism for PCC
regs, such that reading counters is only ever triggered on the local CPU.
This inherently solves the issue in [1].  We have tested this on Kunpeng
SoCs with the CPC regs in both System Memory and FFH.  More tests on other
platforms are welcome though.
[1] https://lore.kernel.org/linux-pm/20250730032312.167062-3-yubowen8@huawei.com/
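
At a high level, patch 2 makes the driver choose the tick behaviour when
the FIE is initialized, roughly as follows (a simplified sketch of the
approach; see patch 2 for the actual change):

	/* keep the deferred irq_work + kthread path only for PCC regs */
	if (cppc_perf_ctrs_in_pcc())
		cppc_sftd.set_freq_scale = cppc_scale_freq_tick_pcc;
	/* otherwise the default tick callback updates arch_freq_scale directly */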

Changelog:

v2:
- Update the cover letter and the commit log based on v1 discussion
- Update FIE arch_freq_scale in ticks for non-PCC regs

v1:
https://lore.kernel.org/linux-pm/20250730032312.167062-1-yubowen8@huawei.com/

Jie Zhan (2):
  cpufreq: CPPC: Don't warn if FIE init fails to read counters
  cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs

 drivers/cpufreq/cppc_cpufreq.c | 64 +++++++++++++++++++++-------------
 1 file changed, 39 insertions(+), 25 deletions(-)

-- 
2.33.0



* [PATCH v2 1/2] cpufreq: CPPC: Don't warn if FIE init fails to read counters
  2025-08-28 11:02 [PATCH v2 0/2] cpufreq: CPPC: Rework FIE warning prints and cppc_scale_freq_tick() Jie Zhan
@ 2025-08-28 11:02 ` Jie Zhan
  2025-08-28 11:02 ` [PATCH v2 2/2] cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs Jie Zhan
  1 sibling, 0 replies; 3+ messages in thread
From: Jie Zhan @ 2025-08-28 11:02 UTC (permalink / raw)
  To: viresh.kumar, rafael, ionela.voinescu, beata.michalska,
	zhenglifeng1
  Cc: linux-pm, linux-acpi, linux-arm-kernel, linuxarm, zhanjie9,
	jonathan.cameron

During the CPPC FIE initialization, reading perf counters on offline CPUs
is expected to fail.  Don't warn in this case.

Also, change the error log level to debug since FIE is optional.

Co-developed-by: Bowen Yu <yubowen8@huawei.com>
Signed-off-by: Bowen Yu <yubowen8@huawei.com> # Changing loglevel to debug
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
---
 drivers/cpufreq/cppc_cpufreq.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 4a17162a392d..7724318b3415 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -144,16 +144,10 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
 		init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
 
 		ret = cppc_get_perf_ctrs(cpu, &cppc_fi->prev_perf_fb_ctrs);
-		if (ret) {
-			pr_warn("%s: failed to read perf counters for cpu:%d: %d\n",
+		if (ret && cpu_online(cpu)) {
+			pr_debug("%s: failed to read perf counters for cpu:%d: %d\n",
 				__func__, cpu, ret);
-
-			/*
-			 * Don't abort if the CPU was offline while the driver
-			 * was getting registered.
-			 */
-			if (cpu_online(cpu))
-				return;
+			return;
 		}
 	}
 
-- 
2.33.0



* [PATCH v2 2/2] cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs
  2025-08-28 11:02 [PATCH v2 0/2] cpufreq: CPPC: Rework FIE warning prints and cppc_scale_freq_tick() Jie Zhan
  2025-08-28 11:02 ` [PATCH v2 1/2] cpufreq: CPPC: Don't warn if FIE init fails to read counters Jie Zhan
@ 2025-08-28 11:02 ` Jie Zhan
  1 sibling, 0 replies; 3+ messages in thread
From: Jie Zhan @ 2025-08-28 11:02 UTC (permalink / raw)
  To: viresh.kumar, rafael, ionela.voinescu, beata.michalska,
	zhenglifeng1
  Cc: linux-pm, linux-acpi, linux-arm-kernel, linuxarm, zhanjie9,
	jonathan.cameron

Currently, the CPPC Frequency Invariance Engine (FIE) is invoked from the
scheduler tick but defers the update of arch_freq_scale to a separate
thread because cppc_get_perf_ctrs() would sleep if the CPC regs are in PCC.

However, this deferred update mechanism is unnecessary and introduces
extra overhead for non-PCC register spaces (e.g. System Memory or FFH),
where accessing the regs won't sleep and can safely be done from the tick
context.  Also, reading the perf counters of a remote CPU may return 0 if
that CPU is in a low-power idle state, e.g. a power-down or reset state.

Update arch_freq_scale directly in ticks for non-PCC regs and keep the
deferred update mechanism for PCC regs.
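
The resulting tick-time split then looks roughly like this (simplified
from the diff below):

	static void cppc_scale_freq_tick(void)
	{
		/* non-PCC regs: read counters and update arch_freq_scale right here */
		__cppc_scale_freq_tick(&per_cpu(cppc_freq_inv, smp_processor_id()));
	}

	static void cppc_scale_freq_tick_pcc(void)
	{
		struct cppc_freq_invariance *cppc_fi = &per_cpu(cppc_freq_inv, smp_processor_id());

		/* PCC regs may sleep: defer the counter read via irq_work + kthread */
		irq_work_queue(&cppc_fi->irq_work);
	}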

Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
---
 drivers/cpufreq/cppc_cpufreq.c | 52 +++++++++++++++++++++++-----------
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 7724318b3415..66d74b062ceb 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -55,31 +55,24 @@ static int cppc_perf_from_fbctrs(struct cppc_cpudata *cpu_data,
 				 struct cppc_perf_fb_ctrs *fb_ctrs_t1);
 
 /**
- * cppc_scale_freq_workfn - CPPC arch_freq_scale updater for frequency invariance
- * @work: The work item.
+ * __cppc_scale_freq_tick - CPPC arch_freq_scale updater for frequency invariance
+ * @cppc_fi: per-cpu CPPC FIE data.
  *
- * The CPPC driver register itself with the topology core to provide its own
+ * The CPPC driver registers itself with the topology core to provide its own
  * implementation (cppc_scale_freq_tick()) of topology_scale_freq_tick() which
  * gets called by the scheduler on every tick.
  *
  * Note that the arch specific counters have higher priority than CPPC counters,
  * if available, though the CPPC driver doesn't need to have any special
  * handling for that.
- *
- * On an invocation of cppc_scale_freq_tick(), we schedule an irq work (since we
- * reach here from hard-irq context), which then schedules a normal work item
- * and cppc_scale_freq_workfn() updates the per_cpu arch_freq_scale variable
- * based on the counter updates since the last tick.
  */
-static void cppc_scale_freq_workfn(struct kthread_work *work)
+static void __cppc_scale_freq_tick(struct cppc_freq_invariance *cppc_fi)
 {
-	struct cppc_freq_invariance *cppc_fi;
 	struct cppc_perf_fb_ctrs fb_ctrs = {0};
 	struct cppc_cpudata *cpu_data;
 	unsigned long local_freq_scale;
 	u64 perf;
 
-	cppc_fi = container_of(work, struct cppc_freq_invariance, work);
 	cpu_data = cppc_fi->cpu_data;
 
 	if (cppc_get_perf_ctrs(cppc_fi->cpu, &fb_ctrs)) {
@@ -104,6 +97,14 @@ static void cppc_scale_freq_workfn(struct kthread_work *work)
 	per_cpu(arch_freq_scale, cppc_fi->cpu) = local_freq_scale;
 }
 
+static void cppc_scale_freq_workfn(struct kthread_work *work)
+{
+	struct cppc_freq_invariance *cppc_fi;
+
+	cppc_fi = container_of(work, struct cppc_freq_invariance, work);
+	__cppc_scale_freq_tick(cppc_fi);
+}
+
 static void cppc_irq_work(struct irq_work *irq_work)
 {
 	struct cppc_freq_invariance *cppc_fi;
@@ -112,7 +113,14 @@ static void cppc_irq_work(struct irq_work *irq_work)
 	kthread_queue_work(kworker_fie, &cppc_fi->work);
 }
 
-static void cppc_scale_freq_tick(void)
+/*
+ * Reading perf counters may sleep if the CPC regs are in PCC.  Thus, we
+ * schedule an irq work in scale_freq_tick (since we reach here from hard-irq
+ * context), which then schedules a normal work item cppc_scale_freq_workfn()
+ * that updates the per_cpu arch_freq_scale variable based on the counter
+ * updates since the last tick.
+ */
+static void cppc_scale_freq_tick_pcc(void)
 {
 	struct cppc_freq_invariance *cppc_fi = &per_cpu(cppc_freq_inv, smp_processor_id());
 
@@ -123,6 +131,11 @@ static void cppc_scale_freq_tick(void)
 	irq_work_queue(&cppc_fi->irq_work);
 }
 
+static void cppc_scale_freq_tick(void)
+{
+	__cppc_scale_freq_tick(&per_cpu(cppc_freq_inv, smp_processor_id()));
+}
+
 static struct scale_freq_data cppc_sftd = {
 	.source = SCALE_FREQ_SOURCE_CPPC,
 	.set_freq_scale = cppc_scale_freq_tick,
@@ -140,8 +153,10 @@ static void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
 		cppc_fi = &per_cpu(cppc_freq_inv, cpu);
 		cppc_fi->cpu = cpu;
 		cppc_fi->cpu_data = policy->driver_data;
-		kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn);
-		init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
+		if (cppc_perf_ctrs_in_pcc()) {
+			kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn);
+			init_irq_work(&cppc_fi->irq_work, cppc_irq_work);
+		}
 
 		ret = cppc_get_perf_ctrs(cpu, &cppc_fi->prev_perf_fb_ctrs);
 		if (ret && cpu_online(cpu)) {
@@ -174,6 +189,9 @@ static void cppc_cpufreq_cpu_fie_exit(struct cpufreq_policy *policy)
 	/* policy->cpus will be empty here, use related_cpus instead */
 	topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC, policy->related_cpus);
 
+	if (!cppc_perf_ctrs_in_pcc())
+		return;
+
 	for_each_cpu(cpu, policy->related_cpus) {
 		cppc_fi = &per_cpu(cppc_freq_inv, cpu);
 		irq_work_sync(&cppc_fi->irq_work);
@@ -206,9 +224,11 @@ static void __init cppc_freq_invariance_init(void)
 		}
 	}
 
-	if (fie_disabled)
+	if (fie_disabled || !cppc_perf_ctrs_in_pcc())
 		return;
 
+	cppc_sftd.set_freq_scale = cppc_scale_freq_tick_pcc;
+
 	kworker_fie = kthread_run_worker(0, "cppc_fie");
 	if (IS_ERR(kworker_fie)) {
 		pr_warn("%s: failed to create kworker_fie: %ld\n", __func__,
@@ -228,7 +248,7 @@ static void __init cppc_freq_invariance_init(void)
 
 static void cppc_freq_invariance_exit(void)
 {
-	if (fie_disabled)
+	if (fie_disabled || !cppc_perf_ctrs_in_pcc())
 		return;
 
 	kthread_destroy_worker(kworker_fie);
-- 
2.33.0


