Linux Power Management development
 help / color / mirror / Atom feed
From: Sumit Gupta <sumitg@nvidia.com>
To: <rafael@kernel.org>, <viresh.kumar@linaro.org>,
	<pierre.gondois@arm.com>, <ionela.voinescu@arm.com>,
	<zhenglifeng1@huawei.com>, <zhanjie9@hisilicon.com>,
	<linux-kernel@vger.kernel.org>, <linux-pm@vger.kernel.org>,
	<linux-tegra@vger.kernel.org>
Cc: <treding@nvidia.com>, <jonathanh@nvidia.com>, <vsethi@nvidia.com>,
	<ksitaraman@nvidia.com>, <sanjayc@nvidia.com>, <mochs@nvidia.com>,
	<bbasu@nvidia.com>, <sumitg@nvidia.com>
Subject: [PATCH] cpufreq: CPPC: Preserve OSPM-set registers across hotplug and unload
Date: Tue, 23 Jun 2026 15:24:03 +0530	[thread overview]
Message-ID: <20260623095403.3407436-1-sumitg@nvidia.com> (raw)

Values written to OSPM-set CPPC registers (via sysfs or the autonomous
boot parameter) can be lost in two ways:

  - Across CPU hotplug: the platform may reset a CPU's registers when it
    is offlined.
  - On driver unload: the value the driver wrote is left in the register
    instead of returning to its pre-driver state.

Add a small table-driven mechanism that handles both:

  - Capture each register's firmware value when a CPU is first seen and
    restore it on driver unload.
  - Record the last value the driver set and reapply it from ->init()
    when the policy is reactivated after CPU hotplug.

The firmware value is captured on a CPU's first activation rather than
once at module load, so CPUs offline at boot or hot-added later are
covered.

Reapply is only needed on a full policy teardown and bring-up, which goes
through ->init(). In a SHARED_TYPE_ANY policy, offlining a single CPU
leaves the shared register untouched, so nothing is lost there.

Cover the OSPM Nominal Performance, Autonomous Selection
(auto_sel) and Energy Performance Preference (EPP) registers. For
auto_sel it replaces the previous unconditional
cppc_set_auto_sel(cpu, false) on unload with a restore of the firmware
value captured at the CPU's first init.

Suggested-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
---

This applies on top of (not yet merged):
 [1] ACPI: CPPC: Add ospm_nominal_perf support
     https://lore.kernel.org/lkml/20260615185934.2383514-1-sumitg@nvidia.com/
 [2] cpufreq: CPPC: add autonomous mode boot parameter support
     https://lore.kernel.org/lkml/20260623080652.3353386-1-sumitg@nvidia.com/

 drivers/cpufreq/cppc_cpufreq.c | 194 +++++++++++++++++++++++++++++++--
 1 file changed, 186 insertions(+), 8 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index a3fabfb07fbe..d6ea2cbde187 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -28,6 +28,174 @@
 
 static struct cpufreq_driver cppc_cpufreq_driver;
 
+/*
+ * OSPM-set CPPC registers tracked for save/restore. A value set via sysfs or
+ * the autonomous boot parameter is reapplied across CPU hotplug, and the
+ * firmware value is restored on driver unload.
+ */
+enum cppc_saved_reg_id {
+	CPPC_SAVED_OSPM_NOMINAL_PERF,
+	CPPC_SAVED_AUTO_SEL,
+	CPPC_SAVED_EPP,
+	CPPC_NR_SAVED_REGS,
+};
+
+struct cppc_saved_reg {
+	int (*get)(int cpu, u64 *val);
+	int (*set)(int cpu, u64 val);
+};
+
+/* u64 wrappers so the bool auto_sel register fits the table signatures. */
+static int cppc_get_auto_sel_u64(int cpu, u64 *val)
+{
+	bool enable;
+	int ret;
+
+	ret = cppc_get_auto_sel(cpu, &enable);
+	if (ret)
+		return ret;
+
+	*val = enable;
+	return 0;
+}
+
+static int cppc_set_auto_sel_u64(int cpu, u64 val)
+{
+	return cppc_set_auto_sel(cpu, !!val);
+}
+
+static const struct cppc_saved_reg cppc_saved_regs[CPPC_NR_SAVED_REGS] = {
+	[CPPC_SAVED_OSPM_NOMINAL_PERF] = {
+		cppc_get_ospm_nominal_perf, cppc_set_ospm_nominal_perf,
+	},
+	[CPPC_SAVED_AUTO_SEL] = {
+		cppc_get_auto_sel_u64, cppc_set_auto_sel_u64,
+	},
+	[CPPC_SAVED_EPP] = {
+		cppc_get_epp_perf, cppc_set_epp,
+	},
+};
+
+/*
+ * Per-CPU saved state for each register in cppc_saved_regs[]:
+ *   firmware_val      - register value before the driver touched it, restored
+ *                       on unload
+ *   requested_val     - last value the driver set (sysfs or boot parameter),
+ *                       reapplied on policy reactivation
+ *   firmware_captured - whether firmware_val has been read, so a not-yet-seen
+ *                       CPU isn't mistaken for one whose firmware value is 0
+ */
+struct cppc_saved_state {
+	u64 firmware_val;
+	u64 requested_val;
+	bool firmware_captured;
+};
+
+/*
+ * Per-CPU and not tied to a policy, so the saved values survive policy
+ * teardown/bring-up across CPU hotplug. cpu_data->perf_ctrls is per-policy
+ * and freed on policy ->exit.
+ */
+static DEFINE_PER_CPU(struct cppc_saved_state[CPPC_NR_SAVED_REGS], cppc_saved_state);
+
+static void cppc_cache_perf_ctrls(struct cppc_cpudata *cpu_data,
+				  enum cppc_saved_reg_id reg, u64 val)
+{
+	switch (reg) {
+	case CPPC_SAVED_AUTO_SEL:
+		cpu_data->perf_ctrls.auto_sel = val;
+		break;
+	case CPPC_SAVED_EPP:
+		cpu_data->perf_ctrls.energy_perf = val;
+		break;
+	default:
+		break;
+	}
+}
+
+/*
+ * Save the requested value for the given register and CPU, to be reapplied when
+ * the policy is reactivated after CPU hotplug. Also update the per-policy
+ * perf_ctrls copy so the saved and current values stay in sync.
+ */
+static void cppc_save_requested(struct cppc_cpudata *cpu_data, unsigned int cpu,
+				enum cppc_saved_reg_id reg, u64 val)
+{
+	per_cpu(cppc_saved_state, cpu)[reg].requested_val = val;
+	cppc_cache_perf_ctrls(cpu_data, reg, val);
+}
+
+/*
+ * Reapply each register's last requested value from ->init(), so a value set
+ * via sysfs or the boot parameter survives a policy teardown and bring-up
+ * across CPU hotplug. Also keep perf_ctrls in sync with it.
+ */
+static void cppc_cpufreq_reapply_requested_regs(struct cpufreq_policy *policy)
+{
+	struct cppc_cpudata *cpu_data = policy->driver_data;
+	unsigned int cpu, i;
+	u64 val;
+
+	for_each_cpu(cpu, policy->cpus) {
+		for (i = 0; i < CPPC_NR_SAVED_REGS; i++) {
+			val = per_cpu(cppc_saved_state, cpu)[i].requested_val;
+			if (val == U64_MAX)
+				continue;
+
+			cppc_saved_regs[i].set(cpu, val);
+
+			/* Keep perf_ctrls in sync via the policy's CPU. */
+			if (cpu == policy->cpu)
+				cppc_cache_perf_ctrls(cpu_data, i, val);
+		}
+	}
+}
+
+/*
+ * On a CPU's first ->init(), capture each register's firmware value to be
+ * restored on driver unload. Later calls for the same CPU are a no-op. Capturing
+ * from ->init() rather than module load covers CPUs that appear later. Also seed
+ * requested_val to U64_MAX so its zeroed default is not taken as a request for 0.
+ */
+static void cppc_cpufreq_save_firmware_regs(struct cpufreq_policy *policy)
+{
+	unsigned int cpu, i;
+	u64 val;
+
+	for_each_cpu(cpu, policy->cpus) {
+		for (i = 0; i < CPPC_NR_SAVED_REGS; i++) {
+			struct cppc_saved_state *s =
+				&per_cpu(cppc_saved_state, cpu)[i];
+
+			/* Capture once per CPU; skip if already recorded. */
+			if (s->firmware_captured)
+				continue;
+
+			if (cppc_saved_regs[i].get(cpu, &val))
+				val = U64_MAX;
+			s->firmware_val = val;
+			s->requested_val = U64_MAX;
+			s->firmware_captured = true;
+		}
+	}
+}
+
+/* On driver unload, restore each captured CPU's firmware value. */
+static void cppc_cpufreq_restore_firmware_regs(void)
+{
+	unsigned int cpu, i;
+
+	for_each_present_cpu(cpu) {
+		for (i = 0; i < CPPC_NR_SAVED_REGS; i++) {
+			struct cppc_saved_state *s =
+				&per_cpu(cppc_saved_state, cpu)[i];
+
+			if (s->firmware_captured && s->firmware_val != U64_MAX)
+				cppc_saved_regs[i].set(cpu, s->firmware_val);
+		}
+	}
+}
+
 /* Autonomous Selection boot parameter modes */
 enum {
 	AUTO_SEL_DISABLED = 0,
@@ -766,6 +934,9 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 	policy->cur = cppc_perf_to_khz(caps, caps->highest_perf);
 	cpu_data->perf_ctrls.desired_perf =  caps->highest_perf;
 
+	/* Capture a CPU's firmware values on its first init, before any driver write. */
+	cppc_cpufreq_save_firmware_regs(policy);
+
 	/*
 	 * Enable autonomous mode on first init if boot param is set.
 	 * Check last_governor to detect first init and skip if auto_sel
@@ -812,7 +983,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 			if (ret && ret != -EOPNOTSUPP)
 				pr_warn("Failed to set EPP for CPU%d (%d)\n", cpu, ret);
 			else if (!ret)
-				cpu_data->perf_ctrls.energy_perf = epp;
+				cppc_save_requested(cpu_data, cpu, CPPC_SAVED_EPP, epp);
 		}
 
 		/* Program min/max/desired into CPPC regs (non-fatal on failure). */
@@ -826,7 +997,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 			pr_warn("auto_sel CPU%d failed (%d); using OS mode\n",
 				cpu, ret);
 		else if (!ret)
-			cpu_data->perf_ctrls.auto_sel = true;
+			cppc_save_requested(cpu_data, cpu, CPPC_SAVED_AUTO_SEL, true);
 	}
 
 	if (cpu_data->perf_ctrls.auto_sel) {
@@ -850,6 +1021,10 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 	}
 
 	cppc_cpufreq_cpu_fie_init(policy);
+
+	/* Reapply any saved values lost across a full policy teardown. */
+	cppc_cpufreq_reapply_requested_regs(policy);
+
 	return 0;
 
 out:
@@ -1039,6 +1214,8 @@ static ssize_t store_auto_select(struct cpufreq_policy *policy,
 		}
 	}
 
+	cppc_save_requested(cpu_data, policy->cpu, CPPC_SAVED_AUTO_SEL, val);
+
 	return count;
 }
 
@@ -1111,7 +1288,7 @@ store_energy_performance_preference_val(struct cpufreq_policy *policy,
 	if (ret)
 		return ret;
 
-	cpu_data->perf_ctrls.energy_perf = val;
+	cppc_save_requested(cpu_data, policy->cpu, CPPC_SAVED_EPP, val);
 
 	return count;
 }
@@ -1193,6 +1370,9 @@ static ssize_t store_ospm_nominal_freq(struct cpufreq_policy *policy,
 		}
 	}
 
+	for_each_cpu(sib, policy->cpus)
+		cppc_save_requested(cpu_data, sib, CPPC_SAVED_OSPM_NOMINAL_PERF, perf);
+
 	return count;
 
 rollback:
@@ -1258,13 +1438,11 @@ static int __init cppc_cpufreq_init(void)
 
 static void __exit cppc_cpufreq_exit(void)
 {
-	unsigned int cpu;
-
-	for_each_present_cpu(cpu)
-		cppc_set_auto_sel(cpu, false);
-
 	cpufreq_unregister_driver(&cppc_cpufreq_driver);
 	cppc_freq_invariance_exit();
+
+	/* Restore auto_sel and the other saved registers to their firmware value. */
+	cppc_cpufreq_restore_firmware_regs();
 }
 
 module_param_cb(auto_sel_mode, &auto_sel_mode_ops, &auto_sel_mode, 0444);
-- 
2.34.1


             reply	other threads:[~2026-06-23  9:54 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-23  9:54 Sumit Gupta [this message]
2026-06-23 11:00 ` [PATCH] cpufreq: CPPC: Preserve OSPM-set registers across hotplug and unload Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260623095403.3407436-1-sumitg@nvidia.com \
    --to=sumitg@nvidia.com \
    --cc=bbasu@nvidia.com \
    --cc=ionela.voinescu@arm.com \
    --cc=jonathanh@nvidia.com \
    --cc=ksitaraman@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=mochs@nvidia.com \
    --cc=pierre.gondois@arm.com \
    --cc=rafael@kernel.org \
    --cc=sanjayc@nvidia.com \
    --cc=treding@nvidia.com \
    --cc=viresh.kumar@linaro.org \
    --cc=vsethi@nvidia.com \
    --cc=zhanjie9@hisilicon.com \
    --cc=zhenglifeng1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox