From: Sumit Gupta <sumitg@nvidia.com>
To: <rafael@kernel.org>, <viresh.kumar@linaro.org>, <lenb@kernel.org>,
	<pierre.gondois@arm.com>, <zhenglifeng1@huawei.com>,
	<zhanjie9@hisilicon.com>, <mario.limonciello@amd.com>,
	<saket.dumbre@intel.com>, <linux-acpi@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <linux-pm@vger.kernel.org>,
	<acpica-devel@lists.linux.dev>
Cc: <treding@nvidia.com>, <jonathanh@nvidia.com>, <vsethi@nvidia.com>,
	<ksitaraman@nvidia.com>, <sanjayc@nvidia.com>, <mochs@nvidia.com>,
	<bbasu@nvidia.com>, <sumitg@nvidia.com>
Subject: [PATCH v5] ACPI: CPPC: Add ospm_nominal_perf support
Date: Tue, 16 Jun 2026 00:29:34 +0530
Message-ID: <20260615185934.2383514-1-sumitg@nvidia.com>

Expose the OSPM Nominal Performance register (ACPI 6.6, Section
8.4.6.1.2.6), which conveys the desired nominal performance level
at which the platform may run. Unlike the existing read-only
Nominal Performance register, it is writable and lets OSPM
request a lower nominal level than the platform-reported nominal.
The platform classifies performance above this level as boosted
and below as throttled for its power/thermal decisions.

It is exposed as a per-policy cpufreq sysfs attribute in kHz, to
match the cpufreq sysfs unit convention:

  /sys/devices/system/cpu/cpufreq/policyN/ospm_nominal_freq

The attribute is documented in
Documentation/ABI/testing/sysfs-devices-system-cpu.

Writes are converted to perf via cppc_khz_to_perf(), validated
against [Lowest Performance, Nominal Performance], and applied to
every CPU in policy->cpus. If a write fails, the CPUs already
updated are rolled back to the previous register value.

On read, the current register value is returned, or
"<unsupported>" if the platform does not implement the register.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
---
Patch 1 of the v4 series ("ACPI: CPPC: Add support for CPPC v4") is
already applied, so this contains only patch 2.

Changes in v5:
- Add cppc_get_ospm_nominal_perf() to read the register directly.
- Drop the cppc_cpudata cache variables ospm_nominal_perf/_set.
- show_ospm_nominal_freq() returns the register value or "<unsupported>".
- Register rollback reads the register too.
- Move the range check from cppc_set_ospm_nominal_perf() into the sysfs
  store.
- ABI doc: update read description and add a task-migration note.
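
For reviewers, the store path after these changes can be modelled in
userspace roughly as below. This is only a sketch under stated
assumptions, not the driver code: struct sketch_platform,
struct sketch_caps, sketch_write_reg() and sketch_store() are
hypothetical names, the registers are a plain array, and the kHz to
perf conversion done by cppc_khz_to_perf() is left out. It shows the
range check, the write to each CPU of the policy, and the rollback of
the CPUs already written when a later write fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 8

/* Hypothetical stand-in for the per-CPU OSPM Nominal Performance registers. */
struct sketch_platform {
	uint64_t reg[SKETCH_MAX_CPUS];
	int fail_cpu;		/* CPU whose write fails, or -1 for none */
};

/* Hypothetical subset of the performance capabilities used by the check. */
struct sketch_caps {
	uint32_t lowest_perf;
	uint32_t nominal_perf;
};

/* Mock register write: fails with -EIO on the CPU chosen for failure. */
static int sketch_write_reg(struct sketch_platform *p, int cpu, uint64_t val)
{
	if (cpu == p->fail_cpu)
		return -EIO;
	p->reg[cpu] = val;
	return 0;
}

/*
 * Validate @perf against [lowest_perf, nominal_perf], then write it to
 * every CPU in @cpus. On a failed write, restore the previous value on
 * the CPUs that were already updated and return the error.
 */
static int sketch_store(struct sketch_platform *p,
			const struct sketch_caps *caps,
			const int *cpus, size_t n_cpus, uint32_t perf)
{
	uint64_t prev;
	size_t i, j;
	int ret;

	assert(n_cpus > 0 && n_cpus <= SKETCH_MAX_CPUS);

	if (perf < caps->lowest_perf || perf > caps->nominal_perf)
		return -EINVAL;

	/* Previous value is taken from the first CPU of the policy. */
	prev = p->reg[cpus[0]];

	for (i = 0; i < n_cpus; i++) {
		ret = sketch_write_reg(p, cpus[i], perf);
		if (ret) {
			for (j = 0; j < i; j++)
				sketch_write_reg(p, cpus[j], prev);
			return ret;
		}
	}

	return 0;
}
```

As in the patch, the value restored on rollback is the one read from a
single CPU of the policy, so the sketch assumes all CPUs of a policy
held the same value before the write.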

v4: https://lore.kernel.org/lkml/20260527194626.185286-1-sumitg@nvidia.com/
v3: https://lore.kernel.org/lkml/20260514194822.1841748-1-sumitg@nvidia.com/
v2: https://lore.kernel.org/lkml/20260430142430.755437-1-sumitg@nvidia.com/
v1: https://lore.kernel.org/lkml/20260427051823.280419-1-sumitg@nvidia.com/
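
A userspace consumer of the new attribute has to handle both read
results described above: a decimal kHz value or the literal string
"<unsupported>". A minimal parsing sketch follows, assuming the text
has already been read from the sysfs file. The helper name
parse_ospm_nominal_freq() and its error mapping are hypothetical and
not part of this patch.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the text returned by one read of ospm_nominal_freq.
 *
 * Returns 0 and stores the value in *khz for a decimal number,
 * -EOPNOTSUPP for "<unsupported>", and -EINVAL for anything else.
 */
static int parse_ospm_nominal_freq(const char *text, unsigned long *khz)
{
	static const char unsupported[] = "<unsupported>";
	unsigned long val;
	char *end;

	assert(text && khz);

	if (strncmp(text, unsupported, strlen(unsupported)) == 0)
		return -EOPNOTSUPP;

	/* Reject signs and leading whitespace that strtoul() would accept. */
	if (!isdigit((unsigned char)text[0]))
		return -EINVAL;

	errno = 0;
	val = strtoul(text, &end, 10);
	if (errno || (*end != '\n' && *end != '\0'))
		return -EINVAL;

	*khz = val;
	return 0;
}
```

Mapping "<unsupported>" to -EOPNOTSUPP mirrors the error the kernel
helpers return when the register is absent, but that choice is only a
convention of this sketch.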

 .../ABI/testing/sysfs-devices-system-cpu      | 26 ++++++++
 drivers/acpi/cppc_acpi.c                      | 32 +++++++++
 drivers/cpufreq/cppc_cpufreq.c                | 65 +++++++++++++++++++
 include/acpi/cppc_acpi.h                      | 10 +++
 4 files changed, 133 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 82d10d556cc8..a8d592c08823 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -346,6 +346,32 @@ Description:	Performance Limited
 
 		This file is only present if the cppc-cpufreq driver is in use.
 
+What:		/sys/devices/system/cpu/cpuX/cpufreq/ospm_nominal_freq
+Date:		May 2026
+Contact:	linux-pm@vger.kernel.org
+Description:	OSPM Nominal Performance (kHz)
+
+		OSPM uses this attribute to request a nominal performance
+		level lower than the platform-reported nominal. The
+		platform treats performance above this level as boost
+		and below as throttle for power and thermal decisions.
+
+		Read returns the current value in kHz, or "<unsupported>"
+		if the platform does not implement the register. Write a
+		kHz value in the range [lowest_freq, nominal_freq].
+
+		Note that tasks may be migrated from one CPU to another
+		by the scheduler's load-balancing algorithm, and if
+		different OSPM Nominal Performance values are set for
+		those CPUs (through different cpufreq policies), that may
+		lead to undesirable outcomes. To avoid such issues it is
+		better to set the same value across all policies, or to
+		pin every task potentially sensitive to it to a specific
+		CPU.
+
+		This file is only present if the cppc-cpufreq driver is
+		in use.
+
 What:		/sys/devices/system/cpu/cpu*/cache/index3/cache_disable_{0,1}
 Date:		August 2008
 KernelVersion:	2.6.27
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 9f572f481241..1fcc22a10b4c 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1685,6 +1685,38 @@ int cppc_set_epp(int cpu, u64 epp_val)
 }
 EXPORT_SYMBOL_GPL(cppc_set_epp);
 
+/**
+ * cppc_set_ospm_nominal_perf() - Write OSPM Nominal Performance register.
+ * @cpu: CPU on which to write register.
+ * @ospm_nominal_perf: Value to write to the OSPM Nominal Performance register.
+ *
+ * OSPM Nominal Performance conveys the desired nominal performance level
+ * at which the platform may run. Per ACPI 6.6, s8.4.6.1.2.6, the value
+ * must lie within [Lowest Performance, Nominal Performance] and may be
+ * set independently of Minimum, Maximum and Desired performance. The
+ * caller is responsible for validating the range.
+ *
+ * Return: 0 on success or negative error code.
+ */
+int cppc_set_ospm_nominal_perf(int cpu, u64 ospm_nominal_perf)
+{
+	return cppc_set_reg_val(cpu, OSPM_NOMINAL_PERF, ospm_nominal_perf);
+}
+EXPORT_SYMBOL_GPL(cppc_set_ospm_nominal_perf);
+
+/**
+ * cppc_get_ospm_nominal_perf() - Read OSPM Nominal Performance register.
+ * @cpu: CPU from which to read register.
+ * @ospm_nominal_perf: Pointer to store the OSPM Nominal Performance value.
+ *
+ * Return: 0 on success or negative error code.
+ */
+int cppc_get_ospm_nominal_perf(int cpu, u64 *ospm_nominal_perf)
+{
+	return cppc_get_reg_val(cpu, OSPM_NOMINAL_PERF, ospm_nominal_perf);
+}
+EXPORT_SYMBOL_GPL(cppc_get_ospm_nominal_perf);
+
 /**
  * cppc_get_auto_act_window() - Read autonomous activity window register.
  * @cpu: CPU from which to read register.
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index f6cea0c54dd9..d160ceced7d9 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -1011,11 +1011,75 @@ static int cppc_get_perf_limited_filtered(int cpu, u64 *perf_limited)
 CPPC_CPUFREQ_ATTR_RW_U64(perf_limited, cppc_get_perf_limited_filtered,
 			 cppc_set_perf_limited)
 
+static ssize_t show_ospm_nominal_freq(struct cpufreq_policy *policy, char *buf)
+{
+	struct cppc_cpudata *cpu_data = policy->driver_data;
+	u64 perf;
+	int ret;
+
+	ret = cppc_get_ospm_nominal_perf(policy->cpu, &perf);
+	if (ret == -EOPNOTSUPP)
+		return sysfs_emit(buf, "<unsupported>\n");
+	if (ret)
+		return ret;
+
+	return sysfs_emit(buf, "%u\n",
+			  cppc_perf_to_khz(&cpu_data->perf_caps, perf));
+}
+
+static ssize_t store_ospm_nominal_freq(struct cpufreq_policy *policy,
+				       const char *buf, size_t count)
+{
+	struct cppc_cpudata *cpu_data = policy->driver_data;
+	unsigned int sib, freq_khz, failing_cpu = 0;
+	u64 prev_perf;
+	u32 perf;
+	int ret;
+
+	ret = kstrtouint(buf, 0, &freq_khz);
+	if (ret)
+		return ret;
+
+	perf = cppc_khz_to_perf(&cpu_data->perf_caps, freq_khz);
+	if (perf < cpu_data->perf_caps.lowest_perf ||
+	    perf > cpu_data->perf_caps.nominal_perf)
+		return -EINVAL;
+
+	/* Save the current value to roll back to if a sibling write fails. */
+	ret = cppc_get_ospm_nominal_perf(policy->cpu, &prev_perf);
+	if (ret)
+		return ret;
+
+	for_each_cpu(sib, policy->cpus) {
+		ret = cppc_set_ospm_nominal_perf(sib, perf);
+		if (ret) {
+			failing_cpu = sib;
+			goto rollback;
+		}
+	}
+
+	return count;
+
+rollback:
+	/*
+	 * Restore the previous value on siblings already updated.
+	 * for_each_cpu() iterates in CPU-id order, so siblings before
+	 * @failing_cpu were updated successfully.
+	 */
+	for_each_cpu(sib, policy->cpus) {
+		if (sib == failing_cpu)
+			break;
+		cppc_set_ospm_nominal_perf(sib, prev_perf);
+	}
+	return ret;
+}
+
 cpufreq_freq_attr_ro(freqdomain_cpus);
 cpufreq_freq_attr_rw(auto_select);
 cpufreq_freq_attr_rw(auto_act_window);
 cpufreq_freq_attr_rw(energy_performance_preference_val);
 cpufreq_freq_attr_rw(perf_limited);
+cpufreq_freq_attr_rw(ospm_nominal_freq);
 
 static struct freq_attr *cppc_cpufreq_attr[] = {
 	&freqdomain_cpus,
@@ -1023,6 +1087,7 @@ static struct freq_attr *cppc_cpufreq_attr[] = {
 	&auto_act_window,
 	&energy_performance_preference_val,
 	&perf_limited,
+	&ospm_nominal_freq,
 	NULL,
 };
 
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index 8693890a7275..b545fec3fd47 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -180,6 +180,8 @@ extern int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val);
 extern int cppc_get_epp_perf(int cpunum, u64 *epp_perf);
 extern int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls, bool enable);
 extern int cppc_set_epp(int cpu, u64 epp_val);
+extern int cppc_set_ospm_nominal_perf(int cpu, u64 ospm_nominal_perf);
+extern int cppc_get_ospm_nominal_perf(int cpu, u64 *ospm_nominal_perf);
 extern int cppc_get_auto_act_window(int cpu, u64 *auto_act_window);
 extern int cppc_set_auto_act_window(int cpu, u64 auto_act_window);
 extern int cppc_get_auto_sel(int cpu, bool *enable);
@@ -266,6 +268,14 @@ static inline int cppc_set_epp(int cpu, u64 epp_val)
 {
 	return -EOPNOTSUPP;
 }
+static inline int cppc_set_ospm_nominal_perf(int cpu, u64 ospm_nominal_perf)
+{
+	return -EOPNOTSUPP;
+}
+static inline int cppc_get_ospm_nominal_perf(int cpu, u64 *ospm_nominal_perf)
+{
+	return -EOPNOTSUPP;
+}
 static inline int cppc_get_auto_act_window(int cpu, u64 *auto_act_window)
 {
 	return -EOPNOTSUPP;
-- 
2.34.1