* [PATCH 00/14] amd-pstate cleanups
From: Mario Limonciello @ 2025-02-06 21:56 UTC
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
This series overhauls locking and drops many unnecessarily cached
variables.
Debugging messages are also dropped in favor of additional ftrace events.
This series is based on the superm1/linux.git bleeding-edge branch.
Mario Limonciello (14):
cpufreq/amd-pstate: Show a warning when a CPU fails to setup
cpufreq/amd-pstate: Drop min and max cached frequencies
cpufreq/amd-pstate: Move perf values into a union
cpufreq/amd-pstate: Overhaul locking
cpufreq/amd-pstate: Drop `cppc_cap1_cached`
cpufreq/amd-pstate-ut: Use _free macro to free put policy
cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks
cpufreq/amd-pstate: Cache CPPC request in shared mem case too
cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and
*_set_epp functions
cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes
cpufreq/amd-pstate: Drop debug statements for policy setting
cpufreq/amd-pstate: Cache a pointer to policy in cpudata
cpufreq/amd-pstate: Rework CPPC enabling
cpufreq/amd-pstate: Stop caching EPP
arch/x86/include/asm/msr-index.h | 18 +-
arch/x86/kernel/acpi/cppc.c | 2 +-
drivers/cpufreq/amd-pstate-trace.h | 13 +-
drivers/cpufreq/amd-pstate-ut.c | 72 ++--
drivers/cpufreq/amd-pstate.c | 632 ++++++++++++++---------------
drivers/cpufreq/amd-pstate.h | 64 +--
6 files changed, 398 insertions(+), 403 deletions(-)
--
2.43.0
* [PATCH 01/14] cpufreq/amd-pstate: Show a warning when a CPU fails to setup
From: Mario Limonciello @ 2025-02-06 21:56 UTC
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
I came across a system on which MSR_AMD_CPPC_CAP1 isn't populated for
some CPUs. This is unexpected behavior, most likely caused by a BIOS bug.
In the event it happens, I'd like users to report bugs so the issue can
be properly root-caused and fixed.
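A minimal sketch of the failure mode (a hypothetical simplification; the
actual read lives in the perf-init helpers touched later in this series):

    /*
     * Hypothetical: if the BIOS leaves MSR_AMD_CPPC_CAP1 unpopulated, the
     * perf values derived from it are zero and CPU init bails out; the
     * new pr_warn() below makes that failure visible instead of silent.
     */
    u64 cap1;
    int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &cap1);

    if (ret || !cap1)
            return ret ? ret : -ENODEV;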
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index f425fb7ec77d7..573643654e8d6 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1034,6 +1034,7 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
free_cpudata2:
freq_qos_remove_request(&cpudata->req[0]);
free_cpudata1:
+ pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret);
kfree(cpudata);
return ret;
}
@@ -1527,6 +1528,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
return 0;
free_cpudata1:
+ pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret);
kfree(cpudata);
return ret;
}
--
2.43.0
* [PATCH 02/14] cpufreq/amd-pstate: Drop min and max cached frequencies
From: Mario Limonciello @ 2025-02-06 21:56 UTC
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
Use the perf_to_freq helpers to calculate these values on the fly
instead of caching them.
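For reference, this is the existing helper (before the next patch reworks
its signature) that makes the cached values redundant:

    static inline u32 perf_to_freq(struct amd_cpudata *cpudata, u8 perf_val)
    {
            return DIV_ROUND_UP_ULL((u64)cpudata->nominal_freq * perf_val,
                                    cpudata->nominal_perf);
    }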
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate-ut.c | 14 +++----
drivers/cpufreq/amd-pstate.c | 74 ++++++++++-----------------------
drivers/cpufreq/amd-pstate.h | 8 ----
3 files changed, 29 insertions(+), 67 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
index 3a0a380c3590c..445278cf40b61 100644
--- a/drivers/cpufreq/amd-pstate-ut.c
+++ b/drivers/cpufreq/amd-pstate-ut.c
@@ -214,14 +214,14 @@ static void amd_pstate_ut_check_freq(u32 index)
break;
cpudata = policy->driver_data;
- if (!((cpudata->max_freq >= cpudata->nominal_freq) &&
+ if (!((policy->cpuinfo.max_freq >= cpudata->nominal_freq) &&
(cpudata->nominal_freq > cpudata->lowest_nonlinear_freq) &&
- (cpudata->lowest_nonlinear_freq > cpudata->min_freq) &&
- (cpudata->min_freq > 0))) {
+ (cpudata->lowest_nonlinear_freq > policy->cpuinfo.min_freq) &&
+ (policy->cpuinfo.min_freq > 0))) {
amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
pr_err("%s cpu%d max=%d >= nominal=%d > lowest_nonlinear=%d > min=%d > 0, the formula is incorrect!\n",
- __func__, cpu, cpudata->max_freq, cpudata->nominal_freq,
- cpudata->lowest_nonlinear_freq, cpudata->min_freq);
+ __func__, cpu, policy->cpuinfo.max_freq, cpudata->nominal_freq,
+ cpudata->lowest_nonlinear_freq, policy->cpuinfo.min_freq);
goto skip_test;
}
@@ -233,13 +233,13 @@ static void amd_pstate_ut_check_freq(u32 index)
}
if (cpudata->boost_supported) {
- if ((policy->max == cpudata->max_freq) ||
+ if ((policy->max == policy->cpuinfo.max_freq) ||
(policy->max == cpudata->nominal_freq))
amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_PASS;
else {
amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
pr_err("%s cpu%d policy_max=%d should be equal cpu_max=%d or cpu_nominal=%d !\n",
- __func__, cpu, policy->max, cpudata->max_freq,
+ __func__, cpu, policy->max, policy->cpuinfo.max_freq,
cpudata->nominal_freq);
goto skip_test;
}
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 573643654e8d6..668377f55b630 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -615,8 +615,6 @@ static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
WRITE_ONCE(cpudata->max_limit_perf, max_limit_perf);
WRITE_ONCE(cpudata->min_limit_perf, min_limit_perf);
- WRITE_ONCE(cpudata->max_limit_freq, policy->max);
- WRITE_ONCE(cpudata->min_limit_freq, policy->min);
return 0;
}
@@ -628,8 +626,7 @@ static int amd_pstate_update_freq(struct cpufreq_policy *policy,
struct amd_cpudata *cpudata = policy->driver_data;
u8 des_perf;
- if (policy->min != cpudata->min_limit_freq || policy->max != cpudata->max_limit_freq)
- amd_pstate_update_min_max_limit(policy);
+ amd_pstate_update_min_max_limit(policy);
freqs.old = policy->cur;
freqs.new = target_freq;
@@ -684,8 +681,7 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
cpudata = policy->driver_data;
- if (policy->min != cpudata->min_limit_freq || policy->max != cpudata->max_limit_freq)
- amd_pstate_update_min_max_limit(policy);
+ amd_pstate_update_min_max_limit(policy);
cap_perf = READ_ONCE(cpudata->highest_perf);
min_limit_perf = READ_ONCE(cpudata->min_limit_perf);
@@ -717,7 +713,7 @@ static int amd_pstate_cpu_boost_update(struct cpufreq_policy *policy, bool on)
int ret = 0;
nominal_freq = READ_ONCE(cpudata->nominal_freq);
- max_freq = READ_ONCE(cpudata->max_freq);
+ max_freq = perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf));
if (on)
policy->cpuinfo.max_freq = max_freq;
@@ -901,35 +897,25 @@ static u32 amd_pstate_get_transition_latency(unsigned int cpu)
static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
{
int ret;
- u32 min_freq, max_freq;
- u32 nominal_freq, lowest_nonlinear_freq;
+ u32 min_freq, nominal_freq, lowest_nonlinear_freq;
struct cppc_perf_caps cppc_perf;
ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;
- if (quirks && quirks->lowest_freq)
- min_freq = quirks->lowest_freq;
- else
- min_freq = cppc_perf.lowest_freq;
-
if (quirks && quirks->nominal_freq)
nominal_freq = quirks->nominal_freq;
else
nominal_freq = cppc_perf.nominal_freq;
- min_freq *= 1000;
nominal_freq *= 1000;
-
WRITE_ONCE(cpudata->nominal_freq, nominal_freq);
- WRITE_ONCE(cpudata->min_freq, min_freq);
- max_freq = perf_to_freq(cpudata, cpudata->highest_perf);
- lowest_nonlinear_freq = perf_to_freq(cpudata, cpudata->lowest_nonlinear_perf);
-
- WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
- WRITE_ONCE(cpudata->max_freq, max_freq);
+ if (quirks && quirks->lowest_freq) {
+ min_freq = quirks->lowest_freq;
+ } else
+ min_freq = cppc_perf.lowest_freq;
/**
* Below values need to be initialized correctly, otherwise driver will fail to load
@@ -937,12 +923,15 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
* lowest_nonlinear_freq is a value between [min_freq, nominal_freq]
* Check _CPC in ACPI table objects if any values are incorrect
*/
- if (min_freq <= 0 || max_freq <= 0 || nominal_freq <= 0 || min_freq > max_freq) {
- pr_err("min_freq(%d) or max_freq(%d) or nominal_freq(%d) value is incorrect\n",
- min_freq, max_freq, nominal_freq);
+ if (nominal_freq <= 0) {
+ pr_err("nominal_freq(%d) value is incorrect\n",
+ nominal_freq);
return -EINVAL;
}
+ lowest_nonlinear_freq = perf_to_freq(cpudata, cpudata->lowest_nonlinear_perf);
+ WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
+
if (lowest_nonlinear_freq <= min_freq || lowest_nonlinear_freq > nominal_freq) {
pr_err("lowest_nonlinear_freq(%d) value is out of range [min_freq(%d), nominal_freq(%d)]\n",
lowest_nonlinear_freq, min_freq, nominal_freq);
@@ -954,9 +943,9 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
{
- int min_freq, max_freq, ret;
- struct device *dev;
struct amd_cpudata *cpudata;
+ struct device *dev;
+ int ret;
/*
* Resetting PERF_CTL_MSR will put the CPU in P0 frequency,
@@ -987,17 +976,11 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;
- min_freq = READ_ONCE(cpudata->min_freq);
- max_freq = READ_ONCE(cpudata->max_freq);
-
policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
- policy->min = min_freq;
- policy->max = max_freq;
-
- policy->cpuinfo.min_freq = min_freq;
- policy->cpuinfo.max_freq = max_freq;
+ policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
+ policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
@@ -1021,9 +1004,6 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
goto free_cpudata2;
}
- cpudata->max_limit_freq = max_freq;
- cpudata->min_limit_freq = min_freq;
-
policy->driver_data = cpudata;
if (!current_pstate_driver->adjust_perf)
@@ -1081,14 +1061,10 @@ static int amd_pstate_cpu_suspend(struct cpufreq_policy *policy)
static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
char *buf)
{
- int max_freq;
struct amd_cpudata *cpudata = policy->driver_data;
- max_freq = READ_ONCE(cpudata->max_freq);
- if (max_freq < 0)
- return max_freq;
- return sysfs_emit(buf, "%u\n", max_freq);
+ return sysfs_emit(buf, "%u\n", perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf)));
}
static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *policy,
@@ -1446,10 +1422,10 @@ static bool amd_pstate_acpi_pm_profile_undefined(void)
static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
{
- int min_freq, max_freq, ret;
struct amd_cpudata *cpudata;
struct device *dev;
u64 value;
+ int ret;
/*
* Resetting PERF_CTL_MSR will put the CPU in P0 frequency,
@@ -1480,19 +1456,13 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;
- min_freq = READ_ONCE(cpudata->min_freq);
- max_freq = READ_ONCE(cpudata->max_freq);
-
- policy->cpuinfo.min_freq = min_freq;
- policy->cpuinfo.max_freq = max_freq;
+ policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
+ policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
/* It will be updated by governor */
policy->cur = policy->cpuinfo.min_freq;
policy->driver_data = cpudata;
- policy->min = policy->cpuinfo.min_freq;
- policy->max = policy->cpuinfo.max_freq;
-
policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
/*
diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
index 19d405c6d805e..472044a1de43b 100644
--- a/drivers/cpufreq/amd-pstate.h
+++ b/drivers/cpufreq/amd-pstate.h
@@ -44,10 +44,6 @@ struct amd_aperf_mperf {
* priority.
* @min_limit_perf: Cached value of the performance corresponding to policy->min
* @max_limit_perf: Cached value of the performance corresponding to policy->max
- * @min_limit_freq: Cached value of policy->min (in khz)
- * @max_limit_freq: Cached value of policy->max (in khz)
- * @max_freq: the frequency (in khz) that mapped to highest_perf
- * @min_freq: the frequency (in khz) that mapped to lowest_perf
* @nominal_freq: the frequency (in khz) that mapped to nominal_perf
* @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
* @cur: Difference of Aperf/Mperf/tsc count between last and current sample
@@ -77,11 +73,7 @@ struct amd_cpudata {
u8 prefcore_ranking;
u8 min_limit_perf;
u8 max_limit_perf;
- u32 min_limit_freq;
- u32 max_limit_freq;
- u32 max_freq;
- u32 min_freq;
u32 nominal_freq;
u32 lowest_nonlinear_freq;
--
2.43.0
* [PATCH 03/14] cpufreq/amd-pstate: Move perf values into a union
From: Mario Limonciello @ 2025-02-06 21:56 UTC
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
By storing the perf values in a union, all reads and writes can be done
atomically, removing the need for some concurrency protections.
While making this change, also drop the cached frequency values and use
inline helpers to calculate them on demand from the perf values.
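The resulting access pattern, used throughout the diff below, is one
atomic 64-bit load, field updates on a local copy, and one atomic store:

    union perf_cached perf = READ_ONCE(cpudata->perf);  /* single load */

    perf.max_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->max);
    perf.min_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->min);

    WRITE_ONCE(cpudata->perf, perf);                    /* single store */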
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate-ut.c | 17 +--
drivers/cpufreq/amd-pstate.c | 212 +++++++++++++++++++-------------
drivers/cpufreq/amd-pstate.h | 48 +++++---
3 files changed, 163 insertions(+), 114 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
index 445278cf40b61..d9ab98c6f56b1 100644
--- a/drivers/cpufreq/amd-pstate-ut.c
+++ b/drivers/cpufreq/amd-pstate-ut.c
@@ -162,19 +162,20 @@ static void amd_pstate_ut_check_perf(u32 index)
lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
}
- if (highest_perf != READ_ONCE(cpudata->highest_perf) && !cpudata->hw_prefcore) {
+ if (highest_perf != READ_ONCE(cpudata->perf.highest_perf) &&
+ !cpudata->hw_prefcore) {
pr_err("%s cpu%d highest=%d %d highest perf doesn't match\n",
- __func__, cpu, highest_perf, cpudata->highest_perf);
+ __func__, cpu, highest_perf, cpudata->perf.highest_perf);
goto skip_test;
}
- if ((nominal_perf != READ_ONCE(cpudata->nominal_perf)) ||
- (lowest_nonlinear_perf != READ_ONCE(cpudata->lowest_nonlinear_perf)) ||
- (lowest_perf != READ_ONCE(cpudata->lowest_perf))) {
+ if ((nominal_perf != READ_ONCE(cpudata->perf.nominal_perf)) ||
+ (lowest_nonlinear_perf != READ_ONCE(cpudata->perf.lowest_nonlinear_perf)) ||
+ (lowest_perf != READ_ONCE(cpudata->perf.lowest_perf))) {
amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
pr_err("%s cpu%d nominal=%d %d lowest_nonlinear=%d %d lowest=%d %d, they should be equal!\n",
- __func__, cpu, nominal_perf, cpudata->nominal_perf,
- lowest_nonlinear_perf, cpudata->lowest_nonlinear_perf,
- lowest_perf, cpudata->lowest_perf);
+ __func__, cpu, nominal_perf, cpudata->perf.nominal_perf,
+ lowest_nonlinear_perf, cpudata->perf.lowest_nonlinear_perf,
+ lowest_perf, cpudata->perf.lowest_perf);
goto skip_test;
}
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 668377f55b630..77bc6418731ee 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -142,18 +142,17 @@ static struct quirk_entry quirk_amd_7k62 = {
.lowest_freq = 550,
};
-static inline u8 freq_to_perf(struct amd_cpudata *cpudata, unsigned int freq_val)
+static inline u8 freq_to_perf(union perf_cached perf, u32 nominal_freq, unsigned int freq_val)
{
- u8 perf_val = DIV_ROUND_UP_ULL((u64)freq_val * cpudata->nominal_perf,
- cpudata->nominal_freq);
+ u8 perf_val = DIV_ROUND_UP_ULL((u64)freq_val * perf.nominal_perf, nominal_freq);
- return clamp_t(u8, perf_val, cpudata->lowest_perf, cpudata->highest_perf);
+ return clamp_t(u8, perf_val, perf.lowest_perf, perf.highest_perf);
}
-static inline u32 perf_to_freq(struct amd_cpudata *cpudata, u8 perf_val)
+static inline u32 perf_to_freq(union perf_cached perf, u32 nominal_freq, u8 perf_val)
{
- return DIV_ROUND_UP_ULL((u64)cpudata->nominal_freq * perf_val,
- cpudata->nominal_perf);
+ return DIV_ROUND_UP_ULL((u64)nominal_freq * perf_val,
+ perf.nominal_perf);
}
static int __init dmi_matched_7k62_bios_bug(const struct dmi_system_id *dmi)
@@ -347,7 +346,9 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
}
if (trace_amd_pstate_epp_perf_enabled()) {
- trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf,
+ union perf_cached perf = cpudata->perf;
+
+ trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
epp,
FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
FIELD_GET(AMD_CPPC_MAX_PERF_MASK, cpudata->cppc_req_cached),
@@ -425,6 +426,7 @@ static inline int amd_pstate_cppc_enable(bool enable)
static int msr_init_perf(struct amd_cpudata *cpudata)
{
+ union perf_cached perf = cpudata->perf;
u64 cap1, numerator;
int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
@@ -436,19 +438,21 @@ static int msr_init_perf(struct amd_cpudata *cpudata)
if (ret)
return ret;
- WRITE_ONCE(cpudata->highest_perf, numerator);
- WRITE_ONCE(cpudata->max_limit_perf, numerator);
- WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
- WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
- WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));
+ perf.highest_perf = numerator;
+ perf.max_limit_perf = numerator;
+ perf.min_limit_perf = AMD_CPPC_LOWEST_PERF(cap1);
+ perf.nominal_perf = AMD_CPPC_NOMINAL_PERF(cap1);
+ perf.lowest_nonlinear_perf = AMD_CPPC_LOWNONLIN_PERF(cap1);
+ perf.lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
+ WRITE_ONCE(cpudata->perf, perf);
WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1));
- WRITE_ONCE(cpudata->min_limit_perf, AMD_CPPC_LOWEST_PERF(cap1));
return 0;
}
static int shmem_init_perf(struct amd_cpudata *cpudata)
{
struct cppc_perf_caps cppc_perf;
+ union perf_cached perf = cpudata->perf;
u64 numerator;
int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
@@ -459,14 +463,14 @@ static int shmem_init_perf(struct amd_cpudata *cpudata)
if (ret)
return ret;
- WRITE_ONCE(cpudata->highest_perf, numerator);
- WRITE_ONCE(cpudata->max_limit_perf, numerator);
- WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
- WRITE_ONCE(cpudata->lowest_nonlinear_perf,
- cppc_perf.lowest_nonlinear_perf);
- WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
+ perf.highest_perf = numerator;
+ perf.max_limit_perf = numerator;
+ perf.min_limit_perf = cppc_perf.lowest_perf;
+ perf.nominal_perf = cppc_perf.nominal_perf;
+ perf.lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
+ perf.lowest_perf = cppc_perf.lowest_perf;
+ WRITE_ONCE(cpudata->perf, perf);
WRITE_ONCE(cpudata->prefcore_ranking, cppc_perf.highest_perf);
- WRITE_ONCE(cpudata->min_limit_perf, cppc_perf.lowest_perf);
if (cppc_state == AMD_PSTATE_ACTIVE)
return 0;
@@ -549,14 +553,14 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u8 min_perf,
u8 des_perf, u8 max_perf, bool fast_switch, int gov_flags)
{
struct cpufreq_policy *policy __free(put_cpufreq_policy) = cpufreq_cpu_get(cpudata->cpu);
- u8 nominal_perf = READ_ONCE(cpudata->nominal_perf);
+ union perf_cached perf = READ_ONCE(cpudata->perf);
if (!policy)
return;
des_perf = clamp_t(u8, des_perf, min_perf, max_perf);
- policy->cur = perf_to_freq(cpudata, des_perf);
+ policy->cur = perf_to_freq(perf, cpudata->nominal_freq, des_perf);
if ((cppc_state == AMD_PSTATE_GUIDED) && (gov_flags & CPUFREQ_GOV_DYNAMIC_SWITCHING)) {
min_perf = des_perf;
@@ -565,7 +569,7 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u8 min_perf,
/* limit the max perf when core performance boost feature is disabled */
if (!cpudata->boost_supported)
- max_perf = min_t(u8, nominal_perf, max_perf);
+ max_perf = min_t(u8, perf.nominal_perf, max_perf);
if (trace_amd_pstate_perf_enabled() && amd_pstate_sample(cpudata)) {
trace_amd_pstate_perf(min_perf, des_perf, max_perf, cpudata->freq,
@@ -602,36 +606,41 @@ static int amd_pstate_verify(struct cpufreq_policy_data *policy_data)
return 0;
}
-static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
+static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
{
- u8 max_limit_perf, min_limit_perf;
struct amd_cpudata *cpudata = policy->driver_data;
+ union perf_cached perf = READ_ONCE(cpudata->perf);
- max_limit_perf = freq_to_perf(cpudata, policy->max);
- min_limit_perf = freq_to_perf(cpudata, policy->min);
+ if (policy->min == perf_to_freq(perf, cpudata->nominal_freq, perf.min_limit_perf) &&
+ policy->max == perf_to_freq(perf, cpudata->nominal_freq, perf.max_limit_perf))
+ return;
- if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
- min_limit_perf = min(cpudata->nominal_perf, max_limit_perf);
+ perf.max_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->max);
+ perf.min_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->min);
- WRITE_ONCE(cpudata->max_limit_perf, max_limit_perf);
- WRITE_ONCE(cpudata->min_limit_perf, min_limit_perf);
+ if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
+ perf.min_limit_perf = min(perf.nominal_perf, perf.max_limit_perf);
- return 0;
+ WRITE_ONCE(cpudata->perf, perf);
}
static int amd_pstate_update_freq(struct cpufreq_policy *policy,
unsigned int target_freq, bool fast_switch)
{
struct cpufreq_freqs freqs;
- struct amd_cpudata *cpudata = policy->driver_data;
+ struct amd_cpudata *cpudata;
+ union perf_cached perf;
u8 des_perf;
amd_pstate_update_min_max_limit(policy);
+ cpudata = policy->driver_data;
+ perf = READ_ONCE(cpudata->perf);
+
freqs.old = policy->cur;
freqs.new = target_freq;
- des_perf = freq_to_perf(cpudata, target_freq);
+ des_perf = freq_to_perf(perf, cpudata->nominal_freq, target_freq);
WARN_ON(fast_switch && !policy->fast_switch_enabled);
/*
@@ -642,8 +651,8 @@ static int amd_pstate_update_freq(struct cpufreq_policy *policy,
if (!fast_switch)
cpufreq_freq_transition_begin(policy, &freqs);
- amd_pstate_update(cpudata, cpudata->min_limit_perf, des_perf,
- cpudata->max_limit_perf, fast_switch,
+ amd_pstate_update(cpudata, perf.min_limit_perf, des_perf,
+ perf.max_limit_perf, fast_switch,
policy->governor->flags);
if (!fast_switch)
@@ -672,19 +681,19 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
unsigned long target_perf,
unsigned long capacity)
{
- u8 max_perf, min_perf, des_perf, cap_perf, min_limit_perf;
+ u8 max_perf, min_perf, des_perf, cap_perf;
struct cpufreq_policy *policy __free(put_cpufreq_policy) = cpufreq_cpu_get(cpu);
struct amd_cpudata *cpudata;
+ union perf_cached perf;
if (!policy)
return;
- cpudata = policy->driver_data;
-
amd_pstate_update_min_max_limit(policy);
- cap_perf = READ_ONCE(cpudata->highest_perf);
- min_limit_perf = READ_ONCE(cpudata->min_limit_perf);
+ cpudata = policy->driver_data;
+ perf = READ_ONCE(cpudata->perf);
+ cap_perf = perf.highest_perf;
des_perf = cap_perf;
if (target_perf < capacity)
@@ -695,10 +704,10 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
else
min_perf = cap_perf;
- if (min_perf < min_limit_perf)
- min_perf = min_limit_perf;
+ if (min_perf < perf.min_limit_perf)
+ min_perf = perf.min_limit_perf;
- max_perf = cpudata->max_limit_perf;
+ max_perf = perf.max_limit_perf;
if (max_perf < min_perf)
max_perf = min_perf;
@@ -709,11 +718,12 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
static int amd_pstate_cpu_boost_update(struct cpufreq_policy *policy, bool on)
{
struct amd_cpudata *cpudata = policy->driver_data;
+ union perf_cached perf = READ_ONCE(cpudata->perf);
u32 nominal_freq, max_freq;
int ret = 0;
nominal_freq = READ_ONCE(cpudata->nominal_freq);
- max_freq = perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf));
+ max_freq = perf_to_freq(perf, cpudata->nominal_freq, perf.highest_perf);
if (on)
policy->cpuinfo.max_freq = max_freq;
@@ -884,25 +894,24 @@ static u32 amd_pstate_get_transition_latency(unsigned int cpu)
}
/*
- * amd_pstate_init_freq: Initialize the max_freq, min_freq,
- * nominal_freq and lowest_nonlinear_freq for
- * the @cpudata object.
+ * amd_pstate_init_freq: Initialize the nominal_freq and lowest_nonlinear_freq
+ * for the @cpudata object.
*
- * Requires: highest_perf, lowest_perf, nominal_perf and
- * lowest_nonlinear_perf members of @cpudata to be
- * initialized.
+ * Requires: all perf members of @cpudata to be initialized.
*
- * Returns 0 on success, non-zero value on failure.
+ * Returns 0 on success, non-zero value on failure.
*/
static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
{
- int ret;
u32 min_freq, nominal_freq, lowest_nonlinear_freq;
struct cppc_perf_caps cppc_perf;
+ union perf_cached perf;
+ int ret;
ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;
+ perf = READ_ONCE(cpudata->perf);
if (quirks && quirks->nominal_freq)
nominal_freq = quirks->nominal_freq;
@@ -914,6 +923,7 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
if (quirks && quirks->lowest_freq) {
min_freq = quirks->lowest_freq;
+ perf.lowest_perf = freq_to_perf(perf, nominal_freq, min_freq);
} else
min_freq = cppc_perf.lowest_freq;
@@ -929,7 +939,7 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
return -EINVAL;
}
- lowest_nonlinear_freq = perf_to_freq(cpudata, cpudata->lowest_nonlinear_perf);
+ lowest_nonlinear_freq = perf_to_freq(perf, nominal_freq, perf.lowest_nonlinear_perf);
WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
if (lowest_nonlinear_freq <= min_freq || lowest_nonlinear_freq > nominal_freq) {
@@ -944,6 +954,7 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata;
+ union perf_cached perf;
struct device *dev;
int ret;
@@ -979,8 +990,14 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
- policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
- policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
+ perf = READ_ONCE(cpudata->perf);
+
+ policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
+ cpudata->nominal_freq,
+ perf.lowest_perf);
+ policy->cpuinfo.max_freq = policy->max = perf_to_freq(perf,
+ cpudata->nominal_freq,
+ perf.highest_perf);
policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
@@ -1061,23 +1078,33 @@ static int amd_pstate_cpu_suspend(struct cpufreq_policy *policy)
static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
char *buf)
{
- struct amd_cpudata *cpudata = policy->driver_data;
+ struct amd_cpudata *cpudata;
+ union perf_cached perf;
+
+ if (!policy)
+ return -EINVAL;
+ cpudata = policy->driver_data;
+ perf = READ_ONCE(cpudata->perf);
- return sysfs_emit(buf, "%u\n", perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf)));
+ return sysfs_emit(buf, "%u\n",
+ perf_to_freq(perf, cpudata->nominal_freq, perf.max_limit_perf));
}
static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *policy,
char *buf)
{
- int freq;
- struct amd_cpudata *cpudata = policy->driver_data;
+ struct amd_cpudata *cpudata;
+ union perf_cached perf;
+
+ if (!policy)
+ return -EINVAL;
- freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
- if (freq < 0)
- return freq;
+ cpudata = policy->driver_data;
+ perf = READ_ONCE(cpudata->perf);
- return sysfs_emit(buf, "%u\n", freq);
+ return sysfs_emit(buf, "%u\n",
+ perf_to_freq(perf, cpudata->nominal_freq, perf.lowest_nonlinear_perf));
}
/*
@@ -1087,12 +1114,14 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
char *buf)
{
- u8 perf;
- struct amd_cpudata *cpudata = policy->driver_data;
+ struct amd_cpudata *cpudata;
- perf = READ_ONCE(cpudata->highest_perf);
+ if (!policy)
+ return -EINVAL;
- return sysfs_emit(buf, "%u\n", perf);
+ cpudata = policy->driver_data;
+
+ return sysfs_emit(buf, "%u\n", cpudata->perf.highest_perf);
}
static ssize_t show_amd_pstate_prefcore_ranking(struct cpufreq_policy *policy,
@@ -1423,6 +1452,7 @@ static bool amd_pstate_acpi_pm_profile_undefined(void)
static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata;
+ union perf_cached perf;
struct device *dev;
u64 value;
int ret;
@@ -1456,8 +1486,15 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;
- policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
- policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
+ perf = READ_ONCE(cpudata->perf);
+
+ policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
+ cpudata->nominal_freq,
+ perf.lowest_perf);
+ policy->cpuinfo.max_freq = policy->max = perf_to_freq(perf,
+ cpudata->nominal_freq,
+ perf.highest_perf);
+
/* It will be updated by governor */
policy->cur = policy->cpuinfo.min_freq;
@@ -1518,6 +1555,7 @@ static void amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
+ union perf_cached perf;
u8 epp;
amd_pstate_update_min_max_limit(policy);
@@ -1527,15 +1565,16 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
else
epp = READ_ONCE(cpudata->epp_cached);
+ perf = READ_ONCE(cpudata->perf);
if (trace_amd_pstate_epp_perf_enabled()) {
- trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf, epp,
- cpudata->min_limit_perf,
- cpudata->max_limit_perf,
+ trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf, epp,
+ perf.min_limit_perf,
+ perf.max_limit_perf,
policy->boost_enabled);
}
- return amd_pstate_update_perf(cpudata, cpudata->min_limit_perf, 0U,
- cpudata->max_limit_perf, epp, false);
+ return amd_pstate_update_perf(cpudata, perf.min_limit_perf, 0U,
+ perf.max_limit_perf, epp, false);
}
static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
@@ -1567,23 +1606,21 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
- u8 max_perf;
+ union perf_cached perf = cpudata->perf;
int ret;
ret = amd_pstate_cppc_enable(true);
if (ret)
pr_err("failed to enable amd pstate during resume, return %d\n", ret);
- max_perf = READ_ONCE(cpudata->highest_perf);
-
if (trace_amd_pstate_epp_perf_enabled()) {
- trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf,
+ trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
cpudata->epp_cached,
FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
- max_perf, policy->boost_enabled);
+ perf.highest_perf, policy->boost_enabled);
}
- return amd_pstate_update_perf(cpudata, 0, 0, max_perf, cpudata->epp_cached, false);
+ return amd_pstate_update_perf(cpudata, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
}
static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
@@ -1604,22 +1641,21 @@ static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
- u8 min_perf;
+ union perf_cached perf = cpudata->perf;
if (cpudata->suspended)
return 0;
- min_perf = READ_ONCE(cpudata->lowest_perf);
-
guard(mutex)(&amd_pstate_limits_lock);
if (trace_amd_pstate_epp_perf_enabled()) {
- trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf,
+ trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
AMD_CPPC_EPP_BALANCE_POWERSAVE,
- min_perf, min_perf, policy->boost_enabled);
+ perf.lowest_perf, perf.lowest_perf,
+ policy->boost_enabled);
}
- return amd_pstate_update_perf(cpudata, min_perf, 0, min_perf,
+ return amd_pstate_update_perf(cpudata, perf.lowest_perf, 0, perf.lowest_perf,
AMD_CPPC_EPP_BALANCE_POWERSAVE, false);
}
diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
index 472044a1de43b..a140704b97430 100644
--- a/drivers/cpufreq/amd-pstate.h
+++ b/drivers/cpufreq/amd-pstate.h
@@ -13,6 +13,34 @@
/*********************************************************************
* AMD P-state INTERFACE *
*********************************************************************/
+
+/**
+ * union perf_cached - A union to cache performance-related data.
+ * @highest_perf: the maximum performance an individual processor may reach,
+ * assuming ideal conditions
+ * For platforms that do not support the preferred core feature, the
+ * highest_perf may be configured with 166 or 255, to avoid the max frequency
+ * being calculated wrongly. We take the fixed value as the highest_perf.
+ * @nominal_perf: the maximum sustained performance level of the processor,
+ * assuming ideal operating conditions
+ * @lowest_nonlinear_perf: the lowest performance level at which nonlinear power
+ * savings are achieved
+ * @lowest_perf: the absolute lowest performance level of the processor
+ * @min_limit_perf: Cached value of the performance corresponding to policy->min
+ * @max_limit_perf: Cached value of the performance corresponding to policy->max
+ */
+union perf_cached {
+ struct {
+ u8 highest_perf;
+ u8 nominal_perf;
+ u8 lowest_nonlinear_perf;
+ u8 lowest_perf;
+ u8 min_limit_perf;
+ u8 max_limit_perf;
+ };
+ u64 val;
+};
+
/**
* struct amd_aperf_mperf
* @aperf: actual performance frequency clock count
@@ -30,20 +58,8 @@ struct amd_aperf_mperf {
* @cpu: CPU number
* @req: constraint request to apply
* @cppc_req_cached: cached performance request hints
- * @highest_perf: the maximum performance an individual processor may reach,
- * assuming ideal conditions
- * For platforms that do not support the preferred core feature, the
- * highest_pef may be configured with 166 or 255, to avoid max frequency
- * calculated wrongly. we take the fixed value as the highest_perf.
- * @nominal_perf: the maximum sustained performance level of the processor,
- * assuming ideal operating conditions
- * @lowest_nonlinear_perf: the lowest performance level at which nonlinear power
- * savings are achieved
- * @lowest_perf: the absolute lowest performance level of the processor
* @prefcore_ranking: the preferred core ranking, the higher value indicates a higher
* priority.
- * @min_limit_perf: Cached value of the performance corresponding to policy->min
- * @max_limit_perf: Cached value of the performance corresponding to policy->max
* @nominal_freq: the frequency (in khz) that mapped to nominal_perf
* @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
* @cur: Difference of Aperf/Mperf/tsc count between last and current sample
@@ -66,13 +82,9 @@ struct amd_cpudata {
struct freq_qos_request req[2];
u64 cppc_req_cached;
- u8 highest_perf;
- u8 nominal_perf;
- u8 lowest_nonlinear_perf;
- u8 lowest_perf;
+ union perf_cached perf;
+
u8 prefcore_ranking;
- u8 min_limit_perf;
- u8 max_limit_perf;
u32 nominal_freq;
u32 lowest_nonlinear_freq;
--
2.43.0
* [PATCH 04/14] cpufreq/amd-pstate: Overhaul locking
From: Mario Limonciello @ 2025-02-06 21:56 UTC
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
amd_pstate_cpu_boost_update() and refresh_frequency_limits() both
update the policy state and have nothing to do with the amd-pstate
driver itself.
A global "limits" lock doesn't make sense because each CPU's policy can
be changed independently. Instead, introduce a lock into the cpudata
structure and lock each CPU independently.
The remaining "global" driver lock is used to ensure that only one
entity can change driver modes at a given time.
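Concretely, each CPU initializes its own mutex and takes it with the
scope-based guard() helper, and callees that rely on it can assert it:

    mutex_init(&cpudata->lock);
    guard(mutex)(&cpudata->lock);  /* released automatically at end of scope */

    /* in helpers that expect the per-CPU lock to be held: */
    lockdep_assert_held(&cpudata->lock);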
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate.c | 27 +++++++++++++++++----------
drivers/cpufreq/amd-pstate.h | 2 ++
2 files changed, 19 insertions(+), 10 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 77bc6418731ee..dd230ed3b9579 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -196,7 +196,6 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
return -EINVAL;
}
-static DEFINE_MUTEX(amd_pstate_limits_lock);
static DEFINE_MUTEX(amd_pstate_driver_lock);
static u8 msr_get_epp(struct amd_cpudata *cpudata)
@@ -283,6 +282,8 @@ static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
u64 value, prev;
int ret;
+ lockdep_assert_held(&cpudata->lock);
+
value = prev = READ_ONCE(cpudata->cppc_req_cached);
value &= ~AMD_CPPC_EPP_PERF_MASK;
value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
@@ -315,6 +316,8 @@ static int shmem_set_epp(struct amd_cpudata *cpudata, u8 epp)
int ret;
struct cppc_perf_ctrls perf_ctrls;
+ lockdep_assert_held(&cpudata->lock);
+
if (epp == cpudata->epp_cached)
return 0;
@@ -335,6 +338,8 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
struct amd_cpudata *cpudata = policy->driver_data;
u8 epp;
+ guard(mutex)(&cpudata->lock);
+
if (!pref_index)
epp = cpudata->epp_default;
else
@@ -750,7 +755,6 @@ static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
pr_err("Boost mode is not supported by this processor or SBIOS\n");
return -EOPNOTSUPP;
}
- guard(mutex)(&amd_pstate_driver_lock);
ret = amd_pstate_cpu_boost_update(policy, state);
refresh_frequency_limits(policy);
@@ -973,6 +977,9 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
cpudata->cpu = policy->cpu;
+ mutex_init(&cpudata->lock);
+ guard(mutex)(&cpudata->lock);
+
ret = amd_pstate_init_perf(cpudata);
if (ret)
goto free_cpudata1;
@@ -1179,8 +1186,6 @@ static ssize_t store_energy_performance_preference(
if (ret < 0)
return -EINVAL;
- guard(mutex)(&amd_pstate_limits_lock);
-
ret = amd_pstate_set_energy_pref_index(policy, ret);
return ret ? ret : count;
@@ -1353,8 +1358,10 @@ int amd_pstate_update_status(const char *buf, size_t size)
if (mode_idx < 0 || mode_idx >= AMD_PSTATE_MAX)
return -EINVAL;
- if (mode_state_machine[cppc_state][mode_idx])
+ if (mode_state_machine[cppc_state][mode_idx]) {
+ guard(mutex)(&amd_pstate_driver_lock);
return mode_state_machine[cppc_state][mode_idx](mode_idx);
+ }
return 0;
}
@@ -1375,7 +1382,6 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
char *p = memchr(buf, '\n', count);
int ret;
- guard(mutex)(&amd_pstate_driver_lock);
ret = amd_pstate_update_status(buf, p ? p - buf : count);
return ret < 0 ? ret : count;
@@ -1472,6 +1478,9 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
cpudata->cpu = policy->cpu;
+ mutex_init(&cpudata->lock);
+ guard(mutex)(&cpudata->lock);
+
ret = amd_pstate_init_perf(cpudata);
if (ret)
goto free_cpudata1;
@@ -1558,6 +1567,8 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
union perf_cached perf;
u8 epp;
+ guard(mutex)(&cpudata->lock);
+
amd_pstate_update_min_max_limit(policy);
if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
@@ -1646,8 +1657,6 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
if (cpudata->suspended)
return 0;
- guard(mutex)(&amd_pstate_limits_lock);
-
if (trace_amd_pstate_epp_perf_enabled()) {
trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
AMD_CPPC_EPP_BALANCE_POWERSAVE,
@@ -1684,8 +1693,6 @@ static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
struct amd_cpudata *cpudata = policy->driver_data;
if (cpudata->suspended) {
- guard(mutex)(&amd_pstate_limits_lock);
-
/* enable amd pstate from suspend state*/
amd_pstate_epp_reenable(policy);
diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
index a140704b97430..6d776c3e5712a 100644
--- a/drivers/cpufreq/amd-pstate.h
+++ b/drivers/cpufreq/amd-pstate.h
@@ -96,6 +96,8 @@ struct amd_cpudata {
bool boost_supported;
bool hw_prefcore;
+ struct mutex lock;
+
/* EPP feature related attributes*/
u8 epp_cached;
u32 policy;
--
2.43.0
* [PATCH 05/14] cpufreq/amd-pstate: Drop `cppc_cap1_cached`
From: Mario Limonciello @ 2025-02-06 21:56 UTC
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
The `cppc_cap1_cached` variable isn't used at all, so there is no
need to read it at initialization for each CPU.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate.c | 5 -----
drivers/cpufreq/amd-pstate.h | 2 --
2 files changed, 7 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index dd230ed3b9579..71636bd9884c8 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1529,11 +1529,6 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
if (ret)
return ret;
WRITE_ONCE(cpudata->cppc_req_cached, value);
-
- ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
- if (ret)
- return ret;
- WRITE_ONCE(cpudata->cppc_cap1_cached, value);
}
ret = amd_pstate_set_epp(cpudata, cpudata->epp_default);
if (ret)
diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
index 6d776c3e5712a..7501d30db9953 100644
--- a/drivers/cpufreq/amd-pstate.h
+++ b/drivers/cpufreq/amd-pstate.h
@@ -71,7 +71,6 @@ struct amd_aperf_mperf {
* AMD P-State driver supports preferred core featue.
* @epp_cached: Cached CPPC energy-performance preference value
* @policy: Cpufreq policy value
- * @cppc_cap1_cached Cached MSR_AMD_CPPC_CAP1 register value
*
* The amd_cpudata is key private data for each CPU thread in AMD P-State, and
* represents all the attributes and goals that AMD P-State requests at runtime.
@@ -101,7 +100,6 @@ struct amd_cpudata {
/* EPP feature related attributes*/
u8 epp_cached;
u32 policy;
- u64 cppc_cap1_cached;
bool suspended;
u8 epp_default;
};
--
2.43.0
* [PATCH 06/14] cpufreq/amd-pstate-ut: Use _free macro to free put policy
From: Mario Limonciello @ 2025-02-06 21:56 UTC
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
Using a scoped cleanup macro simplifies cleanup code.
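The pattern applied below: declaring the policy pointer with __free()
drops the reference automatically when it goes out of scope, so the
goto-based unwind turns into plain returns:

    for_each_possible_cpu(cpu) {
            struct cpufreq_policy *policy __free(put_cpufreq_policy) = NULL;

            policy = cpufreq_cpu_get(cpu);
            if (!policy)
                    break;
            /* any exit from this iteration puts the policy reference */
    }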
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate-ut.c | 33 ++++++++++++++-------------------
1 file changed, 14 insertions(+), 19 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
index d9ab98c6f56b1..adaa62fb2b04e 100644
--- a/drivers/cpufreq/amd-pstate-ut.c
+++ b/drivers/cpufreq/amd-pstate-ut.c
@@ -26,6 +26,7 @@
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/fs.h>
+#include <linux/cleanup.h>
#include <acpi/cppc_acpi.h>
@@ -127,10 +128,11 @@ static void amd_pstate_ut_check_perf(u32 index)
u32 highest_perf = 0, nominal_perf = 0, lowest_nonlinear_perf = 0, lowest_perf = 0;
u64 cap1 = 0;
struct cppc_perf_caps cppc_perf;
- struct cpufreq_policy *policy = NULL;
struct amd_cpudata *cpudata = NULL;
for_each_possible_cpu(cpu) {
+ struct cpufreq_policy *policy __free(put_cpufreq_policy) = NULL;
+
policy = cpufreq_cpu_get(cpu);
if (!policy)
break;
@@ -141,7 +143,7 @@ static void amd_pstate_ut_check_perf(u32 index)
if (ret) {
amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
pr_err("%s cppc_get_perf_caps ret=%d error!\n", __func__, ret);
- goto skip_test;
+ return;
}
highest_perf = cppc_perf.highest_perf;
@@ -153,7 +155,7 @@ static void amd_pstate_ut_check_perf(u32 index)
if (ret) {
amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
pr_err("%s read CPPC_CAP1 ret=%d error!\n", __func__, ret);
- goto skip_test;
+ return;
}
highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
@@ -166,7 +168,7 @@ static void amd_pstate_ut_check_perf(u32 index)
!cpudata->hw_prefcore) {
pr_err("%s cpu%d highest=%d %d highest perf doesn't match\n",
__func__, cpu, highest_perf, cpudata->perf.highest_perf);
- goto skip_test;
+ return;
}
if ((nominal_perf != READ_ONCE(cpudata->perf.nominal_perf)) ||
(lowest_nonlinear_perf != READ_ONCE(cpudata->perf.lowest_nonlinear_perf)) ||
@@ -176,7 +178,7 @@ static void amd_pstate_ut_check_perf(u32 index)
__func__, cpu, nominal_perf, cpudata->perf.nominal_perf,
lowest_nonlinear_perf, cpudata->perf.lowest_nonlinear_perf,
lowest_perf, cpudata->perf.lowest_perf);
- goto skip_test;
+ return;
}
if (!((highest_perf >= nominal_perf) &&
@@ -187,15 +189,11 @@ static void amd_pstate_ut_check_perf(u32 index)
pr_err("%s cpu%d highest=%d >= nominal=%d > lowest_nonlinear=%d > lowest=%d > 0, the formula is incorrect!\n",
__func__, cpu, highest_perf, nominal_perf,
lowest_nonlinear_perf, lowest_perf);
- goto skip_test;
+ return;
}
- cpufreq_cpu_put(policy);
}
amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_PASS;
- return;
-skip_test:
- cpufreq_cpu_put(policy);
}
/*
@@ -206,10 +204,11 @@ static void amd_pstate_ut_check_perf(u32 index)
static void amd_pstate_ut_check_freq(u32 index)
{
int cpu = 0;
- struct cpufreq_policy *policy = NULL;
struct amd_cpudata *cpudata = NULL;
for_each_possible_cpu(cpu) {
+ struct cpufreq_policy *policy __free(put_cpufreq_policy) = NULL;
+
policy = cpufreq_cpu_get(cpu);
if (!policy)
break;
@@ -223,14 +222,14 @@ static void amd_pstate_ut_check_freq(u32 index)
pr_err("%s cpu%d max=%d >= nominal=%d > lowest_nonlinear=%d > min=%d > 0, the formula is incorrect!\n",
__func__, cpu, policy->cpuinfo.max_freq, cpudata->nominal_freq,
cpudata->lowest_nonlinear_freq, policy->cpuinfo.min_freq);
- goto skip_test;
+ return;
}
if (cpudata->lowest_nonlinear_freq != policy->min) {
amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
pr_err("%s cpu%d cpudata_lowest_nonlinear_freq=%d policy_min=%d, they should be equal!\n",
__func__, cpu, cpudata->lowest_nonlinear_freq, policy->min);
- goto skip_test;
+ return;
}
if (cpudata->boost_supported) {
@@ -242,20 +241,16 @@ static void amd_pstate_ut_check_freq(u32 index)
pr_err("%s cpu%d policy_max=%d should be equal cpu_max=%d or cpu_nominal=%d !\n",
__func__, cpu, policy->max, policy->cpuinfo.max_freq,
cpudata->nominal_freq);
- goto skip_test;
+ return;
}
} else {
amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
pr_err("%s cpu%d must support boost!\n", __func__, cpu);
- goto skip_test;
+ return;
}
- cpufreq_cpu_put(policy);
}
amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_PASS;
- return;
-skip_test:
- cpufreq_cpu_put(policy);
}
static int amd_pstate_set_mode(enum amd_pstate_mode mode)
--
2.43.0
* [PATCH 07/14] cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks
From: Mario Limonciello @ 2025-02-06 21:56 UTC
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
Bitfield masks are easier to follow and less error prone.
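With GENMASK()-defined masks, FIELD_GET()/FIELD_PREP() take care of the
shifting that the old macros open-coded, e.g. (mirroring the hunks below):

    u8 nominal_perf = FIELD_GET(AMD_CPPC_NOMINAL_PERF_MASK, cap1);

    value &= ~AMD_CPPC_EPP_PERF_MASK;
    value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);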
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
arch/x86/include/asm/msr-index.h | 18 +++++++++---------
arch/x86/kernel/acpi/cppc.c | 2 +-
drivers/cpufreq/amd-pstate-ut.c | 8 ++++----
drivers/cpufreq/amd-pstate.c | 16 ++++++----------
4 files changed, 20 insertions(+), 24 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3eadc4d5de837..f77335ebae981 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -700,15 +700,15 @@
#define MSR_AMD_CPPC_REQ 0xc00102b3
#define MSR_AMD_CPPC_STATUS 0xc00102b4
-#define AMD_CPPC_LOWEST_PERF(x) (((x) >> 0) & 0xff)
-#define AMD_CPPC_LOWNONLIN_PERF(x) (((x) >> 8) & 0xff)
-#define AMD_CPPC_NOMINAL_PERF(x) (((x) >> 16) & 0xff)
-#define AMD_CPPC_HIGHEST_PERF(x) (((x) >> 24) & 0xff)
-
-#define AMD_CPPC_MAX_PERF(x) (((x) & 0xff) << 0)
-#define AMD_CPPC_MIN_PERF(x) (((x) & 0xff) << 8)
-#define AMD_CPPC_DES_PERF(x) (((x) & 0xff) << 16)
-#define AMD_CPPC_ENERGY_PERF_PREF(x) (((x) & 0xff) << 24)
+#define AMD_CPPC_LOWEST_PERF_MASK GENMASK(7, 0)
+#define AMD_CPPC_LOWNONLIN_PERF_MASK GENMASK(15, 8)
+#define AMD_CPPC_NOMINAL_PERF_MASK GENMASK(23, 16)
+#define AMD_CPPC_HIGHEST_PERF_MASK GENMASK(31, 24)
+
+#define AMD_CPPC_MAX_PERF_MASK GENMASK(7, 0)
+#define AMD_CPPC_MIN_PERF_MASK GENMASK(15, 8)
+#define AMD_CPPC_DES_PERF_MASK GENMASK(23, 16)
+#define AMD_CPPC_EPP_PERF_MASK GENMASK(31, 24)
/* AMD Performance Counter Global Status and Control MSRs */
#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS 0xc0000300
diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c
index d745dd586303c..d68a4cb0168fa 100644
--- a/arch/x86/kernel/acpi/cppc.c
+++ b/arch/x86/kernel/acpi/cppc.c
@@ -149,7 +149,7 @@ int amd_get_highest_perf(unsigned int cpu, u32 *highest_perf)
if (ret)
goto out;
- val = AMD_CPPC_HIGHEST_PERF(val);
+ val = FIELD_GET(AMD_CPPC_HIGHEST_PERF_MASK, val);
} else {
ret = cppc_get_highest_perf(cpu, &val);
if (ret)
diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
index adaa62fb2b04e..2595faa492bf1 100644
--- a/drivers/cpufreq/amd-pstate-ut.c
+++ b/drivers/cpufreq/amd-pstate-ut.c
@@ -158,10 +158,10 @@ static void amd_pstate_ut_check_perf(u32 index)
return;
}
- highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
- nominal_perf = AMD_CPPC_NOMINAL_PERF(cap1);
- lowest_nonlinear_perf = AMD_CPPC_LOWNONLIN_PERF(cap1);
- lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
+ highest_perf = FIELD_GET(AMD_CPPC_HIGHEST_PERF_MASK, cap1);
+ nominal_perf = FIELD_GET(AMD_CPPC_NOMINAL_PERF_MASK, cap1);
+ lowest_nonlinear_perf = FIELD_GET(AMD_CPPC_LOWNONLIN_PERF_MASK, cap1);
+ lowest_perf = FIELD_GET(AMD_CPPC_LOWEST_PERF_MASK, cap1);
}
if (highest_perf != READ_ONCE(cpudata->perf.highest_perf) &&
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 71636bd9884c8..cd96443fc117f 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -89,11 +89,6 @@ static bool cppc_enabled;
static bool amd_pstate_prefcore = true;
static struct quirk_entry *quirks;
-#define AMD_CPPC_MAX_PERF_MASK GENMASK(7, 0)
-#define AMD_CPPC_MIN_PERF_MASK GENMASK(15, 8)
-#define AMD_CPPC_DES_PERF_MASK GENMASK(23, 16)
-#define AMD_CPPC_EPP_PERF_MASK GENMASK(31, 24)
-
/*
* AMD Energy Preference Performance (EPP)
* The EPP is used in the CCLK DPM controller to drive
@@ -445,12 +440,13 @@ static int msr_init_perf(struct amd_cpudata *cpudata)
perf.highest_perf = numerator;
perf.max_limit_perf = numerator;
- perf.min_limit_perf = AMD_CPPC_LOWEST_PERF(cap1);
- perf.nominal_perf = AMD_CPPC_NOMINAL_PERF(cap1);
- perf.lowest_nonlinear_perf = AMD_CPPC_LOWNONLIN_PERF(cap1);
- perf.lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
+ perf.min_limit_perf = FIELD_GET(AMD_CPPC_LOWEST_PERF_MASK, cap1);
+ perf.nominal_perf = FIELD_GET(AMD_CPPC_NOMINAL_PERF_MASK, cap1);
+ perf.lowest_nonlinear_perf = FIELD_GET(AMD_CPPC_LOWNONLIN_PERF_MASK, cap1);
+ perf.lowest_perf = FIELD_GET(AMD_CPPC_LOWEST_PERF_MASK, cap1);
WRITE_ONCE(cpudata->perf, perf);
- WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1));
+ WRITE_ONCE(cpudata->prefcore_ranking, FIELD_GET(AMD_CPPC_HIGHEST_PERF_MASK, cap1));
+
return 0;
}
--
2.43.0
* [PATCH 08/14] cpufreq/amd-pstate: Cache CPPC request in shared mem case too
From: Mario Limonciello @ 2025-02-06 21:56 UTC
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
In order to avoid an unnecessary write in shmem_update_perf(), cache the
request in the cppc_req_cached variable, which was previously only used
in the MSR case.
This adds symmetry to the code and potentially avoids extra writes.
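The short-circuit this enables (see the hunk below) compares the newly
assembled request word against the cached one and skips the shared-memory
write when nothing changed:

    value = prev = READ_ONCE(cpudata->cppc_req_cached);
    /* ... rebuild the min/max/des/epp fields in value ... */
    if (value == prev)
            return 0;   /* request unchanged, skip cppc_set_perf() */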
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index cd96443fc117f..2aa3d5be2efe5 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -502,6 +502,8 @@ static int shmem_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
u8 des_perf, u8 max_perf, u8 epp, bool fast_switch)
{
struct cppc_perf_ctrls perf_ctrls;
+ u64 value, prev;
+ int ret;
if (cppc_state == AMD_PSTATE_ACTIVE) {
int ret = shmem_set_epp(cpudata, epp);
@@ -510,11 +512,29 @@ static int shmem_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
return ret;
}
+ value = prev = READ_ONCE(cpudata->cppc_req_cached);
+
+ value &= ~(AMD_CPPC_MAX_PERF_MASK | AMD_CPPC_MIN_PERF_MASK |
+ AMD_CPPC_DES_PERF_MASK | AMD_CPPC_EPP_PERF_MASK);
+ value |= FIELD_PREP(AMD_CPPC_MAX_PERF_MASK, max_perf);
+ value |= FIELD_PREP(AMD_CPPC_DES_PERF_MASK, des_perf);
+ value |= FIELD_PREP(AMD_CPPC_MIN_PERF_MASK, min_perf);
+ value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
+
+ if (value == prev)
+ return 0;
+
perf_ctrls.max_perf = max_perf;
perf_ctrls.min_perf = min_perf;
perf_ctrls.desired_perf = des_perf;
- return cppc_set_perf(cpudata->cpu, &perf_ctrls);
+ ret = cppc_set_perf(cpudata->cpu, &perf_ctrls);
+ if (ret)
+ return ret;
+
+ WRITE_ONCE(cpudata->cppc_req_cached, value);
+
+ return 0;
}
static inline bool amd_pstate_sample(struct amd_cpudata *cpudata)
--
2.43.0
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH 09/14] cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functions
2025-02-06 21:56 [PATCH 00/14] amd-pstate cleanups Mario Limonciello
` (7 preceding siblings ...)
2025-02-06 21:56 ` [PATCH 08/14] cpufreq/amd-pstate: Cache CPPC request in shared mem case too Mario Limonciello
@ 2025-02-06 21:56 ` Mario Limonciello
2025-02-12 6:39 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 10/14] cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes Mario Limonciello
` (4 subsequent siblings)
13 siblings, 1 reply; 41+ messages in thread
From: Mario Limonciello @ 2025-02-06 21:56 UTC (permalink / raw)
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
The EPP tracing is done by the caller today, but that means the trace
can't capture whether the CPPC request actually changed.
Move it into the update_perf and set_epp functions and include information
about whether the request differs from the previous one.
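One ordering detail worth calling out: the tracepoint fires before the
value == prev short-circuit, so requests that end up being filtered out
still show up in the trace, with the new flag telling the two cases
apart. Condensed (a sketch of the shape shared by the touched
functions, not the literal hunks):

	if (trace_amd_pstate_epp_perf_enabled()) {
		union perf_cached perf = cpudata->perf;

		trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
					  epp, min_perf, max_perf,
					  policy->boost_enabled,
					  value != prev);	/* new "changed" flag */
	}

	if (value == prev)
		return 0;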
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate-trace.h | 13 +++-
drivers/cpufreq/amd-pstate.c | 119 ++++++++++++++++++-----------
2 files changed, 83 insertions(+), 49 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate-trace.h b/drivers/cpufreq/amd-pstate-trace.h
index f457d4af2c62e..32e1bdc588c52 100644
--- a/drivers/cpufreq/amd-pstate-trace.h
+++ b/drivers/cpufreq/amd-pstate-trace.h
@@ -90,7 +90,8 @@ TRACE_EVENT(amd_pstate_epp_perf,
u8 epp,
u8 min_perf,
u8 max_perf,
- bool boost
+ bool boost,
+ bool changed
),
TP_ARGS(cpu_id,
@@ -98,7 +99,8 @@ TRACE_EVENT(amd_pstate_epp_perf,
epp,
min_perf,
max_perf,
- boost),
+ boost,
+ changed),
TP_STRUCT__entry(
__field(unsigned int, cpu_id)
@@ -107,6 +109,7 @@ TRACE_EVENT(amd_pstate_epp_perf,
__field(u8, min_perf)
__field(u8, max_perf)
__field(bool, boost)
+ __field(bool, changed)
),
TP_fast_assign(
@@ -116,15 +119,17 @@ TRACE_EVENT(amd_pstate_epp_perf,
__entry->min_perf = min_perf;
__entry->max_perf = max_perf;
__entry->boost = boost;
+ __entry->changed = changed;
),
- TP_printk("cpu%u: [%hhu<->%hhu]/%hhu, epp=%hhu, boost=%u",
+ TP_printk("cpu%u: [%hhu<->%hhu]/%hhu, epp=%hhu, boost=%u, changed=%u",
(unsigned int)__entry->cpu_id,
(u8)__entry->min_perf,
(u8)__entry->max_perf,
(u8)__entry->highest_perf,
(u8)__entry->epp,
- (bool)__entry->boost
+ (bool)__entry->boost,
+ (bool)__entry->changed
)
);
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 2aa3d5be2efe5..e66ccfce5893f 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -228,9 +228,10 @@ static u8 shmem_get_epp(struct amd_cpudata *cpudata)
return FIELD_GET(AMD_CPPC_EPP_PERF_MASK, epp);
}
-static int msr_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
+static int msr_update_perf(struct cpufreq_policy *policy, u8 min_perf,
u8 des_perf, u8 max_perf, u8 epp, bool fast_switch)
{
+ struct amd_cpudata *cpudata = policy->driver_data;
u64 value, prev;
value = prev = READ_ONCE(cpudata->cppc_req_cached);
@@ -242,6 +243,18 @@ static int msr_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
value |= FIELD_PREP(AMD_CPPC_MIN_PERF_MASK, min_perf);
value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
+ if (trace_amd_pstate_epp_perf_enabled()) {
+ union perf_cached perf = cpudata->perf;
+
+ trace_amd_pstate_epp_perf(cpudata->cpu,
+ perf.highest_perf,
+ epp,
+ min_perf,
+ max_perf,
+ policy->boost_enabled,
+ value != prev);
+ }
+
if (value == prev)
return 0;
@@ -256,24 +269,26 @@ static int msr_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
}
WRITE_ONCE(cpudata->cppc_req_cached, value);
- WRITE_ONCE(cpudata->epp_cached, epp);
+ if (epp != cpudata->epp_cached)
+ WRITE_ONCE(cpudata->epp_cached, epp);
return 0;
}
DEFINE_STATIC_CALL(amd_pstate_update_perf, msr_update_perf);
-static inline int amd_pstate_update_perf(struct amd_cpudata *cpudata,
+static inline int amd_pstate_update_perf(struct cpufreq_policy *policy,
u8 min_perf, u8 des_perf,
u8 max_perf, u8 epp,
bool fast_switch)
{
- return static_call(amd_pstate_update_perf)(cpudata, min_perf, des_perf,
+ return static_call(amd_pstate_update_perf)(policy, min_perf, des_perf,
max_perf, epp, fast_switch);
}
-static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
+static int msr_set_epp(struct cpufreq_policy *policy, u8 epp)
{
+ struct amd_cpudata *cpudata = policy->driver_data;
u64 value, prev;
int ret;
@@ -283,6 +298,19 @@ static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
value &= ~AMD_CPPC_EPP_PERF_MASK;
value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
+ if (trace_amd_pstate_epp_perf_enabled()) {
+ union perf_cached perf = cpudata->perf;
+
+ trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
+ epp,
+ FIELD_GET(AMD_CPPC_MIN_PERF_MASK,
+ cpudata->cppc_req_cached),
+ FIELD_GET(AMD_CPPC_MAX_PERF_MASK,
+ cpudata->cppc_req_cached),
+ policy->boost_enabled,
+ value != prev);
+ }
+
if (value == prev)
return 0;
@@ -301,18 +329,32 @@ static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
DEFINE_STATIC_CALL(amd_pstate_set_epp, msr_set_epp);
-static inline int amd_pstate_set_epp(struct amd_cpudata *cpudata, u8 epp)
+static inline int amd_pstate_set_epp(struct cpufreq_policy *policy, u8 epp)
{
- return static_call(amd_pstate_set_epp)(cpudata, epp);
+ return static_call(amd_pstate_set_epp)(policy, epp);
}
-static int shmem_set_epp(struct amd_cpudata *cpudata, u8 epp)
+static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
{
- int ret;
+ struct amd_cpudata *cpudata = policy->driver_data;
struct cppc_perf_ctrls perf_ctrls;
+ int ret;
lockdep_assert_held(&cpudata->lock);
+ if (trace_amd_pstate_epp_perf_enabled()) {
+ union perf_cached perf = cpudata->perf;
+
+ trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
+ epp,
+ FIELD_GET(AMD_CPPC_MIN_PERF_MASK,
+ cpudata->cppc_req_cached),
+ FIELD_GET(AMD_CPPC_MAX_PERF_MASK,
+ cpudata->cppc_req_cached),
+ policy->boost_enabled,
+ epp != cpudata->epp_cached);
+ }
+
if (epp == cpudata->epp_cached)
return 0;
@@ -345,17 +387,7 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
return -EBUSY;
}
- if (trace_amd_pstate_epp_perf_enabled()) {
- union perf_cached perf = cpudata->perf;
-
- trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
- epp,
- FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
- FIELD_GET(AMD_CPPC_MAX_PERF_MASK, cpudata->cppc_req_cached),
- policy->boost_enabled);
- }
-
- return amd_pstate_set_epp(cpudata, epp);
+ return amd_pstate_set_epp(policy, epp);
}
static inline int msr_cppc_enable(bool enable)
@@ -498,15 +530,16 @@ static inline int amd_pstate_init_perf(struct amd_cpudata *cpudata)
return static_call(amd_pstate_init_perf)(cpudata);
}
-static int shmem_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
+static int shmem_update_perf(struct cpufreq_policy *policy, u8 min_perf,
u8 des_perf, u8 max_perf, u8 epp, bool fast_switch)
{
+ struct amd_cpudata *cpudata = policy->driver_data;
struct cppc_perf_ctrls perf_ctrls;
u64 value, prev;
int ret;
if (cppc_state == AMD_PSTATE_ACTIVE) {
- int ret = shmem_set_epp(cpudata, epp);
+ int ret = shmem_set_epp(policy, epp);
if (ret)
return ret;
@@ -521,6 +554,18 @@ static int shmem_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
value |= FIELD_PREP(AMD_CPPC_MIN_PERF_MASK, min_perf);
value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
+ if (trace_amd_pstate_epp_perf_enabled()) {
+ union perf_cached perf = cpudata->perf;
+
+ trace_amd_pstate_epp_perf(cpudata->cpu,
+ perf.highest_perf,
+ epp,
+ min_perf,
+ max_perf,
+ policy->boost_enabled,
+ value != prev);
+ }
+
if (value == prev)
return 0;
@@ -598,7 +643,7 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u8 min_perf,
cpudata->cpu, fast_switch);
}
- amd_pstate_update_perf(cpudata, min_perf, des_perf, max_perf, 0, fast_switch);
+ amd_pstate_update_perf(policy, min_perf, des_perf, max_perf, 0, fast_switch);
}
static int amd_pstate_verify(struct cpufreq_policy_data *policy_data)
@@ -1546,7 +1591,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
return ret;
WRITE_ONCE(cpudata->cppc_req_cached, value);
}
- ret = amd_pstate_set_epp(cpudata, cpudata->epp_default);
+ ret = amd_pstate_set_epp(policy, cpudata->epp_default);
if (ret)
return ret;
@@ -1588,14 +1633,8 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
epp = READ_ONCE(cpudata->epp_cached);
perf = READ_ONCE(cpudata->perf);
- if (trace_amd_pstate_epp_perf_enabled()) {
- trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf, epp,
- perf.min_limit_perf,
- perf.max_limit_perf,
- policy->boost_enabled);
- }
- return amd_pstate_update_perf(cpudata, perf.min_limit_perf, 0U,
+ return amd_pstate_update_perf(policy, perf.min_limit_perf, 0U,
perf.max_limit_perf, epp, false);
}
@@ -1635,14 +1674,9 @@ static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
if (ret)
pr_err("failed to enable amd pstate during resume, return %d\n", ret);
- if (trace_amd_pstate_epp_perf_enabled()) {
- trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
- cpudata->epp_cached,
- FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
- perf.highest_perf, policy->boost_enabled);
- }
+ guard(mutex)(&cpudata->lock);
- return amd_pstate_update_perf(cpudata, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
+ return amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
}
static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
@@ -1668,14 +1702,9 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
if (cpudata->suspended)
return 0;
- if (trace_amd_pstate_epp_perf_enabled()) {
- trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
- AMD_CPPC_EPP_BALANCE_POWERSAVE,
- perf.lowest_perf, perf.lowest_perf,
- policy->boost_enabled);
- }
+ guard(mutex)(&cpudata->lock);
- return amd_pstate_update_perf(cpudata, perf.lowest_perf, 0, perf.lowest_perf,
+ return amd_pstate_update_perf(policy, perf.lowest_perf, 0, perf.lowest_perf,
AMD_CPPC_EPP_BALANCE_POWERSAVE, false);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH 10/14] cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes
2025-02-06 21:56 [PATCH 00/14] amd-pstate cleanups Mario Limonciello
` (8 preceding siblings ...)
2025-02-06 21:56 ` [PATCH 09/14] cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functions Mario Limonciello
@ 2025-02-06 21:56 ` Mario Limonciello
2025-02-11 13:01 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 11/14] cpufreq/amd-pstate: Drop debug statements for policy setting Mario Limonciello
` (3 subsequent siblings)
13 siblings, 1 reply; 41+ messages in thread
From: Mario Limonciello @ 2025-02-06 21:56 UTC (permalink / raw)
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
On EPP-only writes, update the cached request variable as well, so that
the min/max performance controls don't need to be rewritten afterwards.
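This keeps the value == prev short-circuit that shmem_update_perf()
gained earlier in the series sound: that comparison only works if every
path that programs EPP also folds the result back into the cache. A
condensed sketch of the write-back:

	/*
	 * shmem_set_epp(): after a successful EPP-only write, fold the
	 * new EPP value back into the cached request word
	 */
	value = READ_ONCE(cpudata->cppc_req_cached);
	value &= ~AMD_CPPC_EPP_PERF_MASK;
	value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
	WRITE_ONCE(cpudata->cppc_req_cached, value);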
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index e66ccfce5893f..754f2d606b371 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -338,6 +338,7 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
{
struct amd_cpudata *cpudata = policy->driver_data;
struct cppc_perf_ctrls perf_ctrls;
+ u64 value;
int ret;
lockdep_assert_held(&cpudata->lock);
@@ -366,6 +367,11 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
}
WRITE_ONCE(cpudata->epp_cached, epp);
+ value = READ_ONCE(cpudata->cppc_req_cached);
+ value &= ~AMD_CPPC_EPP_PERF_MASK;
+ value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
+ WRITE_ONCE(cpudata->cppc_req_cached, value);
+
return ret;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH 11/14] cpufreq/amd-pstate: Drop debug statements for policy setting
2025-02-06 21:56 [PATCH 00/14] amd-pstate cleanups Mario Limonciello
` (9 preceding siblings ...)
2025-02-06 21:56 ` [PATCH 10/14] cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes Mario Limonciello
@ 2025-02-06 21:56 ` Mario Limonciello
2025-02-11 13:03 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 12/14] cpufreq/amd-pstate: Cache a pointer to policy in cpudata Mario Limonciello
` (2 subsequent siblings)
13 siblings, 1 reply; 41+ messages in thread
From: Mario Limonciello @ 2025-02-06 21:56 UTC (permalink / raw)
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
Trace events now exist for all amd-pstate modes and output the relevant
information right before programming the hardware.
This leaves the existing debug statements as nothing but unnecessary
overhead. Drop them.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 754f2d606b371..689de385d06da 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -673,7 +673,6 @@ static int amd_pstate_verify(struct cpufreq_policy_data *policy_data)
}
cpufreq_verify_within_cpu_limits(policy_data);
- pr_debug("policy_max =%d, policy_min=%d\n", policy_data->max, policy_data->min);
return 0;
}
@@ -1652,9 +1651,6 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
if (!policy->cpuinfo.max_freq)
return -ENODEV;
- pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
- policy->cpuinfo.max_freq, policy->max);
-
cpudata->policy = policy->policy;
ret = amd_pstate_epp_update_limit(policy);
--
2.43.0
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH 12/14] cpufreq/amd-pstate: Cache a pointer to policy in cpudata
2025-02-06 21:56 [PATCH 00/14] amd-pstate cleanups Mario Limonciello
` (10 preceding siblings ...)
2025-02-06 21:56 ` [PATCH 11/14] cpufreq/amd-pstate: Drop debug statements for policy setting Mario Limonciello
@ 2025-02-06 21:56 ` Mario Limonciello
2025-02-11 13:13 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 13/14] cpufreq/amd-pstate: Rework CPPC enabling Mario Limonciello
2025-02-06 21:56 ` [PATCH 14/14] cpufreq/amd-pstate: Stop caching EPP Mario Limonciello
13 siblings, 1 reply; 41+ messages in thread
From: Mario Limonciello @ 2025-02-06 21:56 UTC (permalink / raw)
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
In order to access the policy from a notification block, it will
need to be stored in cpudata.
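As a purely hypothetical illustration (the notifier itself is not part
of this series, and the nb member below is made up for the example),
the cached pointer lets a callback that only receives cpudata reach the
policy without a cpufreq_cpu_get()/cpufreq_cpu_put() round trip:

	/* hypothetical callback, for illustration only */
	static int amd_pstate_example_notifier(struct notifier_block *nb,
					       unsigned long action, void *data)
	{
		struct amd_cpudata *cpudata = container_of(nb, struct amd_cpudata, nb);
		struct cpufreq_policy *policy = cpudata->policy;

		/* policy->policy, policy->min, policy->max, ... are reachable here */
		return NOTIFY_OK;
	}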
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate.c | 13 +++++++------
drivers/cpufreq/amd-pstate.h | 3 ++-
2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 689de385d06da..5945b6c7f7e56 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -388,7 +388,7 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
else
epp = epp_values[pref_index];
- if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
+ if (epp > 0 && policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
pr_debug("EPP cannot be set under performance policy\n");
return -EBUSY;
}
@@ -689,7 +689,7 @@ static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
perf.max_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->max);
perf.min_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->min);
- if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
+ if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
perf.min_limit_perf = min(perf.nominal_perf, perf.max_limit_perf);
WRITE_ONCE(cpudata->perf, perf);
@@ -1042,6 +1042,7 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
return -ENOMEM;
cpudata->cpu = policy->cpu;
+ cpudata->policy = policy;
mutex_init(&cpudata->lock);
guard(mutex)(&cpudata->lock);
@@ -1224,9 +1225,8 @@ static ssize_t show_energy_performance_available_preferences(
{
int i = 0;
int offset = 0;
- struct amd_cpudata *cpudata = policy->driver_data;
- if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
+ if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
return sysfs_emit_at(buf, offset, "%s\n",
energy_perf_strings[EPP_INDEX_PERFORMANCE]);
@@ -1543,6 +1543,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
return -ENOMEM;
cpudata->cpu = policy->cpu;
+ cpudata->policy = policy;
mutex_init(&cpudata->lock);
guard(mutex)(&cpudata->lock);
@@ -1632,7 +1633,7 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
amd_pstate_update_min_max_limit(policy);
- if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
+ if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
epp = 0;
else
epp = READ_ONCE(cpudata->epp_cached);
@@ -1651,7 +1652,7 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
if (!policy->cpuinfo.max_freq)
return -ENODEV;
- cpudata->policy = policy->policy;
+ cpudata->policy = policy;
ret = amd_pstate_epp_update_limit(policy);
if (ret)
diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
index 7501d30db9953..16ce631a6c3d5 100644
--- a/drivers/cpufreq/amd-pstate.h
+++ b/drivers/cpufreq/amd-pstate.h
@@ -97,9 +97,10 @@ struct amd_cpudata {
struct mutex lock;
+ struct cpufreq_policy *policy;
+
/* EPP feature related attributes*/
u8 epp_cached;
- u32 policy;
bool suspended;
u8 epp_default;
};
--
2.43.0
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH 13/14] cpufreq/amd-pstate: Rework CPPC enabling
2025-02-06 21:56 [PATCH 00/14] amd-pstate cleanups Mario Limonciello
` (11 preceding siblings ...)
2025-02-06 21:56 ` [PATCH 12/14] cpufreq/amd-pstate: Cache a pointer to policy in cpudata Mario Limonciello
@ 2025-02-06 21:56 ` Mario Limonciello
2025-02-13 4:42 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 14/14] cpufreq/amd-pstate: Stop caching EPP Mario Limonciello
13 siblings, 1 reply; 41+ messages in thread
From: Mario Limonciello @ 2025-02-06 21:56 UTC (permalink / raw)
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
The CPPC enable register is configured as "write once"; that is, any
subsequent writes have no effect.
Because of this, all the cleanup paths that currently exist for CPPC
disable are ineffective.
Rework CPPC enabling so that it only happens after all the CAP registers
have been read, to avoid enabling CPPC on CPUs with an invalid _CPC or
unpopulated MSRs.
As the register is write once, remove all cleanup paths as well.
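Condensed, the init ordering that results is (a sketch of
amd_pstate_epp_cpu_init() after this patch; the real hunks are below):

	ret = amd_pstate_init_perf(cpudata);	/* read and validate the CAP registers */
	if (ret)
		goto free_cpudata1;
	...
	policy->driver_data = cpudata;

	/* only now issue the write-once enable; no disable path remains */
	ret = amd_pstate_cppc_enable(policy);
	if (ret)
		goto free_cpudata1;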
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate.c | 188 +++++++++++------------------------
1 file changed, 59 insertions(+), 129 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 5945b6c7f7e56..697fa1b80cf24 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -85,7 +85,6 @@ static struct cpufreq_driver *current_pstate_driver;
static struct cpufreq_driver amd_pstate_driver;
static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_UNDEFINED;
-static bool cppc_enabled;
static bool amd_pstate_prefcore = true;
static struct quirk_entry *quirks;
@@ -375,91 +374,40 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
return ret;
}
-static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
- int pref_index)
+static inline int msr_cppc_enable(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
- u8 epp;
-
- guard(mutex)(&cpudata->lock);
- if (!pref_index)
- epp = cpudata->epp_default;
- else
- epp = epp_values[pref_index];
-
- if (epp > 0 && policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
- pr_debug("EPP cannot be set under performance policy\n");
- return -EBUSY;
- }
-
- return amd_pstate_set_epp(policy, epp);
+ return wrmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_ENABLE, 1);
}
-static inline int msr_cppc_enable(bool enable)
+static int shmem_cppc_enable(struct cpufreq_policy *policy)
{
- int ret, cpu;
- unsigned long logical_proc_id_mask = 0;
-
- /*
- * MSR_AMD_CPPC_ENABLE is write-once, once set it cannot be cleared.
- */
- if (!enable)
- return 0;
-
- if (enable == cppc_enabled)
- return 0;
-
- for_each_present_cpu(cpu) {
- unsigned long logical_id = topology_logical_package_id(cpu);
-
- if (test_bit(logical_id, &logical_proc_id_mask))
- continue;
-
- set_bit(logical_id, &logical_proc_id_mask);
-
- ret = wrmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_ENABLE,
- enable);
- if (ret)
- return ret;
- }
-
- cppc_enabled = enable;
- return 0;
-}
-
-static int shmem_cppc_enable(bool enable)
-{
- int cpu, ret = 0;
+ struct amd_cpudata *cpudata = policy->driver_data;
struct cppc_perf_ctrls perf_ctrls;
+ int ret;
- if (enable == cppc_enabled)
- return 0;
+ ret = cppc_set_enable(cpudata->cpu, 1);
+ if (ret)
+ return ret;
- for_each_present_cpu(cpu) {
- ret = cppc_set_enable(cpu, enable);
+ /* Enable autonomous mode for EPP */
+ if (cppc_state == AMD_PSTATE_ACTIVE) {
+ /* Set desired perf as zero to allow EPP firmware control */
+ perf_ctrls.desired_perf = 0;
+ ret = cppc_set_perf(cpudata->cpu, &perf_ctrls);
if (ret)
return ret;
-
- /* Enable autonomous mode for EPP */
- if (cppc_state == AMD_PSTATE_ACTIVE) {
- /* Set desired perf as zero to allow EPP firmware control */
- perf_ctrls.desired_perf = 0;
- ret = cppc_set_perf(cpu, &perf_ctrls);
- if (ret)
- return ret;
- }
}
- cppc_enabled = enable;
return ret;
}
DEFINE_STATIC_CALL(amd_pstate_cppc_enable, msr_cppc_enable);
-static inline int amd_pstate_cppc_enable(bool enable)
+static inline int amd_pstate_cppc_enable(struct cpufreq_policy *policy)
{
- return static_call(amd_pstate_cppc_enable)(enable);
+ return static_call(amd_pstate_cppc_enable)(policy);
}
static int msr_init_perf(struct amd_cpudata *cpudata)
@@ -1122,24 +1070,7 @@ static void amd_pstate_cpu_exit(struct cpufreq_policy *policy)
static int amd_pstate_cpu_resume(struct cpufreq_policy *policy)
{
- int ret;
-
- ret = amd_pstate_cppc_enable(true);
- if (ret)
- pr_err("failed to enable amd-pstate during resume, return %d\n", ret);
-
- return ret;
-}
-
-static int amd_pstate_cpu_suspend(struct cpufreq_policy *policy)
-{
- int ret;
-
- ret = amd_pstate_cppc_enable(false);
- if (ret)
- pr_err("failed to disable amd-pstate during suspend, return %d\n", ret);
-
- return ret;
+ return amd_pstate_cppc_enable(policy);
}
/* Sysfs attributes */
@@ -1241,8 +1172,10 @@ static ssize_t show_energy_performance_available_preferences(
static ssize_t store_energy_performance_preference(
struct cpufreq_policy *policy, const char *buf, size_t count)
{
+ struct amd_cpudata *cpudata = policy->driver_data;
char str_preference[21];
ssize_t ret;
+ u8 epp;
ret = sscanf(buf, "%20s", str_preference);
if (ret != 1)
@@ -1252,7 +1185,31 @@ static ssize_t store_energy_performance_preference(
if (ret < 0)
return -EINVAL;
- ret = amd_pstate_set_energy_pref_index(policy, ret);
+ if (!ret)
+ epp = cpudata->epp_default;
+ else
+ epp = epp_values[ret];
+
+ if (epp > 0 && policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
+ pr_debug("EPP cannot be set under performance policy\n");
+ return -EBUSY;
+ }
+
+ if (trace_amd_pstate_epp_perf_enabled()) {
+ union perf_cached perf = cpudata->perf;
+
+ trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
+ epp,
+ FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
+ FIELD_GET(AMD_CPPC_MAX_PERF_MASK, cpudata->cppc_req_cached),
+ policy->boost_enabled,
+ FIELD_GET(AMD_CPPC_EPP_PERF_MASK,
+ cpudata->cppc_req_cached) != epp);
+ }
+
+ guard(mutex)(&cpudata->lock);
+
+ ret = amd_pstate_set_epp(policy, epp);
return ret ? ret : count;
}
@@ -1285,7 +1242,6 @@ static ssize_t show_energy_performance_preference(
static void amd_pstate_driver_cleanup(void)
{
- amd_pstate_cppc_enable(false);
cppc_state = AMD_PSTATE_DISABLE;
current_pstate_driver = NULL;
}
@@ -1319,14 +1275,6 @@ static int amd_pstate_register_driver(int mode)
cppc_state = mode;
- ret = amd_pstate_cppc_enable(true);
- if (ret) {
- pr_err("failed to enable cppc during amd-pstate driver registration, return %d\n",
- ret);
- amd_pstate_driver_cleanup();
- return ret;
- }
-
/* at least one CPU supports CPB */
current_pstate_driver->boost_enabled = cpu_feature_enabled(X86_FEATURE_CPB);
@@ -1570,11 +1518,15 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
policy->cpuinfo.max_freq = policy->max = perf_to_freq(perf,
cpudata->nominal_freq,
perf.highest_perf);
+ policy->driver_data = cpudata;
+
+ ret = amd_pstate_cppc_enable(policy);
+ if (ret)
+ goto free_cpudata1;
/* It will be updated by governor */
policy->cur = policy->cpuinfo.min_freq;
- policy->driver_data = cpudata;
policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
@@ -1667,34 +1619,28 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
return 0;
}
-static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
+static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
union perf_cached perf = cpudata->perf;
int ret;
- ret = amd_pstate_cppc_enable(true);
+ pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
+
+ ret = amd_pstate_cppc_enable(policy);
if (ret)
- pr_err("failed to enable amd pstate during resume, return %d\n", ret);
+ return ret;
guard(mutex)(&cpudata->lock);
- return amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
-}
-
-static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
-{
- struct amd_cpudata *cpudata = policy->driver_data;
- int ret;
-
- pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
-
- ret = amd_pstate_epp_reenable(policy);
+ ret = amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
if (ret)
return ret;
+
cpudata->suspended = false;
return 0;
+
}
static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
@@ -1714,20 +1660,10 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
static int amd_pstate_epp_suspend(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
- int ret;
-
- /* avoid suspending when EPP is not enabled */
- if (cppc_state != AMD_PSTATE_ACTIVE)
- return 0;
/* set this flag to avoid setting core offline*/
cpudata->suspended = true;
- /* disable CPPC in lowlevel firmware */
- ret = amd_pstate_cppc_enable(false);
- if (ret)
- pr_err("failed to suspend, return %d\n", ret);
-
return 0;
}
@@ -1735,12 +1671,8 @@ static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
- if (cpudata->suspended) {
- /* enable amd pstate from suspend state*/
- amd_pstate_epp_reenable(policy);
- cpudata->suspended = false;
- }
+ cpudata->suspended = false;
return 0;
}
@@ -1752,7 +1684,6 @@ static struct cpufreq_driver amd_pstate_driver = {
.fast_switch = amd_pstate_fast_switch,
.init = amd_pstate_cpu_init,
.exit = amd_pstate_cpu_exit,
- .suspend = amd_pstate_cpu_suspend,
.resume = amd_pstate_cpu_resume,
.set_boost = amd_pstate_set_boost,
.update_limits = amd_pstate_update_limits,
@@ -1768,8 +1699,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
.exit = amd_pstate_epp_cpu_exit,
.offline = amd_pstate_epp_cpu_offline,
.online = amd_pstate_epp_cpu_online,
- .suspend = amd_pstate_epp_suspend,
- .resume = amd_pstate_epp_resume,
+ .suspend = amd_pstate_epp_suspend,
+ .resume = amd_pstate_epp_resume,
.update_limits = amd_pstate_update_limits,
.set_boost = amd_pstate_set_boost,
.name = "amd-pstate-epp",
@@ -1920,7 +1851,6 @@ static int __init amd_pstate_init(void)
global_attr_free:
cpufreq_unregister_driver(current_pstate_driver);
- amd_pstate_cppc_enable(false);
return ret;
}
device_initcall(amd_pstate_init);
--
2.43.0
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH 14/14] cpufreq/amd-pstate: Stop caching EPP
2025-02-06 21:56 [PATCH 00/14] amd-pstate cleanups Mario Limonciello
` (12 preceding siblings ...)
2025-02-06 21:56 ` [PATCH 13/14] cpufreq/amd-pstate: Rework CPPC enabling Mario Limonciello
@ 2025-02-06 21:56 ` Mario Limonciello
2025-02-11 13:27 ` Dhananjay Ugwekar
13 siblings, 1 reply; 41+ messages in thread
From: Mario Limonciello @ 2025-02-06 21:56 UTC (permalink / raw)
To: Gautham R . Shenoy, Perry Yuan
Cc: Dhananjay Ugwekar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
From: Mario Limonciello <mario.limonciello@amd.com>
EPP values are cached per CPU in the cpudata structure. This is
needless, though, because the same value is already encoded in the
cached CPPC request variable.
Drop the separate EPP cache and always derive the value from the cached
CPPC request when needed.
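Readers then derive EPP on the fly from the request word (a sketch of
the pattern used throughout the diff below):

	/* no separate epp_cached any more; extract it from the request word */
	u8 epp = FIELD_GET(AMD_CPPC_EPP_PERF_MASK,
			   READ_ONCE(cpudata->cppc_req_cached));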
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate.c | 30 ++++++++++++++++--------------
drivers/cpufreq/amd-pstate.h | 1 -
2 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 697fa1b80cf24..38e5e925a7aed 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -268,8 +268,6 @@ static int msr_update_perf(struct cpufreq_policy *policy, u8 min_perf,
}
WRITE_ONCE(cpudata->cppc_req_cached, value);
- if (epp != cpudata->epp_cached)
- WRITE_ONCE(cpudata->epp_cached, epp);
return 0;
}
@@ -320,7 +318,6 @@ static int msr_set_epp(struct cpufreq_policy *policy, u8 epp)
}
/* update both so that msr_update_perf() can effectively check */
- WRITE_ONCE(cpudata->epp_cached, epp);
WRITE_ONCE(cpudata->cppc_req_cached, value);
return ret;
@@ -337,11 +334,14 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
{
struct amd_cpudata *cpudata = policy->driver_data;
struct cppc_perf_ctrls perf_ctrls;
+ u8 epp_cached;
u64 value;
int ret;
lockdep_assert_held(&cpudata->lock);
+ epp_cached = FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached);
+
if (trace_amd_pstate_epp_perf_enabled()) {
union perf_cached perf = cpudata->perf;
@@ -352,10 +352,10 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
FIELD_GET(AMD_CPPC_MAX_PERF_MASK,
cpudata->cppc_req_cached),
policy->boost_enabled,
- epp != cpudata->epp_cached);
+ epp != epp_cached);
}
- if (epp == cpudata->epp_cached)
+ if (epp == epp_cached)
return 0;
perf_ctrls.energy_perf = epp;
@@ -364,7 +364,6 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
pr_debug("failed to set energy perf value (%d)\n", ret);
return ret;
}
- WRITE_ONCE(cpudata->epp_cached, epp);
value = READ_ONCE(cpudata->cppc_req_cached);
value &= ~AMD_CPPC_EPP_PERF_MASK;
@@ -1218,9 +1217,11 @@ static ssize_t show_energy_performance_preference(
struct cpufreq_policy *policy, char *buf)
{
struct amd_cpudata *cpudata = policy->driver_data;
- u8 preference;
+ u8 preference, epp;
+
+ epp = FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached);
- switch (cpudata->epp_cached) {
+ switch (epp) {
case AMD_CPPC_EPP_PERFORMANCE:
preference = EPP_INDEX_PERFORMANCE;
break;
@@ -1588,7 +1589,7 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
epp = 0;
else
- epp = READ_ONCE(cpudata->epp_cached);
+ epp = FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached);
perf = READ_ONCE(cpudata->perf);
@@ -1624,23 +1625,24 @@ static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
struct amd_cpudata *cpudata = policy->driver_data;
union perf_cached perf = cpudata->perf;
int ret;
+ u8 epp;
+
+ guard(mutex)(&cpudata->lock);
+
+ epp = FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached);
pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
ret = amd_pstate_cppc_enable(policy);
if (ret)
return ret;
-
- guard(mutex)(&cpudata->lock);
-
- ret = amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
+ ret = amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, epp, false);
if (ret)
return ret;
cpudata->suspended = false;
return 0;
-
}
static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
index 16ce631a6c3d5..c95b4081a3ff6 100644
--- a/drivers/cpufreq/amd-pstate.h
+++ b/drivers/cpufreq/amd-pstate.h
@@ -100,7 +100,6 @@ struct amd_cpudata {
struct cpufreq_policy *policy;
/* EPP feature related attributes*/
- u8 epp_cached;
bool suspended;
u8 epp_default;
};
--
2.43.0
^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH 02/14] cpufreq/amd-pstate: Drop min and max cached frequencies
2025-02-06 21:56 ` [PATCH 02/14] cpufreq/amd-pstate: Drop min and max cached frequencies Mario Limonciello
@ 2025-02-07 10:44 ` Dhananjay Ugwekar
2025-02-07 16:15 ` Mario Limonciello
0 siblings, 1 reply; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-07 10:44 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
Hello Mario,
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> Use the perf_to_freq helpers to calculate this on the fly.
I think there is a benefit to having the min/max_limit_freq values cached.
These values help us avoid unnecessary amd_pstate_update_min_max_limit() calls in
the majority of cases (i.e. where the policy->min/max values didn't change).
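Concretely, the guard those cached values enable is the check this
patch removes (copied from the hunk below):

	if (policy->min != cpudata->min_limit_freq || policy->max != cpudata->max_limit_freq)
		amd_pstate_update_min_max_limit(policy);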
For the cpudata->min/max_freq values, I think there is little value in caching them;
they are only used in amd_pstate_cpu_boost_update() and show_amd_pstate_max_freq(),
which are not expected to be called very frequently.
So, I propose we keep the cpudata->min/max_limit_freq variables and remove the
cpudata->min/max_freq ones. Thoughts?
Thanks,
Dhananjay
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate-ut.c | 14 +++----
> drivers/cpufreq/amd-pstate.c | 74 ++++++++++-----------------------
> drivers/cpufreq/amd-pstate.h | 8 ----
> 3 files changed, 29 insertions(+), 67 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
> index 3a0a380c3590c..445278cf40b61 100644
> --- a/drivers/cpufreq/amd-pstate-ut.c
> +++ b/drivers/cpufreq/amd-pstate-ut.c
> @@ -214,14 +214,14 @@ static void amd_pstate_ut_check_freq(u32 index)
> break;
> cpudata = policy->driver_data;
>
> - if (!((cpudata->max_freq >= cpudata->nominal_freq) &&
> + if (!((policy->cpuinfo.max_freq >= cpudata->nominal_freq) &&
> (cpudata->nominal_freq > cpudata->lowest_nonlinear_freq) &&
> - (cpudata->lowest_nonlinear_freq > cpudata->min_freq) &&
> - (cpudata->min_freq > 0))) {
> + (cpudata->lowest_nonlinear_freq > policy->cpuinfo.min_freq) &&
> + (policy->cpuinfo.min_freq > 0))) {
> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
> pr_err("%s cpu%d max=%d >= nominal=%d > lowest_nonlinear=%d > min=%d > 0, the formula is incorrect!\n",
> - __func__, cpu, cpudata->max_freq, cpudata->nominal_freq,
> - cpudata->lowest_nonlinear_freq, cpudata->min_freq);
> + __func__, cpu, policy->cpuinfo.max_freq, cpudata->nominal_freq,
> + cpudata->lowest_nonlinear_freq, policy->cpuinfo.min_freq);
> goto skip_test;
> }
>
> @@ -233,13 +233,13 @@ static void amd_pstate_ut_check_freq(u32 index)
> }
>
> if (cpudata->boost_supported) {
> - if ((policy->max == cpudata->max_freq) ||
> + if ((policy->max == policy->cpuinfo.max_freq) ||
> (policy->max == cpudata->nominal_freq))
> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_PASS;
> else {
> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
> pr_err("%s cpu%d policy_max=%d should be equal cpu_max=%d or cpu_nominal=%d !\n",
> - __func__, cpu, policy->max, cpudata->max_freq,
> + __func__, cpu, policy->max, policy->cpuinfo.max_freq,
> cpudata->nominal_freq);
> goto skip_test;
> }
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 573643654e8d6..668377f55b630 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -615,8 +615,6 @@ static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
>
> WRITE_ONCE(cpudata->max_limit_perf, max_limit_perf);
> WRITE_ONCE(cpudata->min_limit_perf, min_limit_perf);
> - WRITE_ONCE(cpudata->max_limit_freq, policy->max);
> - WRITE_ONCE(cpudata->min_limit_freq, policy->min);
>
> return 0;
> }
> @@ -628,8 +626,7 @@ static int amd_pstate_update_freq(struct cpufreq_policy *policy,
> struct amd_cpudata *cpudata = policy->driver_data;
> u8 des_perf;
>
> - if (policy->min != cpudata->min_limit_freq || policy->max != cpudata->max_limit_freq)
> - amd_pstate_update_min_max_limit(policy);
> + amd_pstate_update_min_max_limit(policy);
>
> freqs.old = policy->cur;
> freqs.new = target_freq;
> @@ -684,8 +681,7 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
>
> cpudata = policy->driver_data;
>
> - if (policy->min != cpudata->min_limit_freq || policy->max != cpudata->max_limit_freq)
> - amd_pstate_update_min_max_limit(policy);
> + amd_pstate_update_min_max_limit(policy);
>
> cap_perf = READ_ONCE(cpudata->highest_perf);
> min_limit_perf = READ_ONCE(cpudata->min_limit_perf);
> @@ -717,7 +713,7 @@ static int amd_pstate_cpu_boost_update(struct cpufreq_policy *policy, bool on)
> int ret = 0;
>
> nominal_freq = READ_ONCE(cpudata->nominal_freq);
> - max_freq = READ_ONCE(cpudata->max_freq);
> + max_freq = perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf));
>
> if (on)
> policy->cpuinfo.max_freq = max_freq;
> @@ -901,35 +897,25 @@ static u32 amd_pstate_get_transition_latency(unsigned int cpu)
> static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
> {
> int ret;
> - u32 min_freq, max_freq;
> - u32 nominal_freq, lowest_nonlinear_freq;
> + u32 min_freq, nominal_freq, lowest_nonlinear_freq;
> struct cppc_perf_caps cppc_perf;
>
> ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> if (ret)
> return ret;
>
> - if (quirks && quirks->lowest_freq)
> - min_freq = quirks->lowest_freq;
> - else
> - min_freq = cppc_perf.lowest_freq;
> -
> if (quirks && quirks->nominal_freq)
> nominal_freq = quirks->nominal_freq;
> else
> nominal_freq = cppc_perf.nominal_freq;
>
> - min_freq *= 1000;
> nominal_freq *= 1000;
> -
> WRITE_ONCE(cpudata->nominal_freq, nominal_freq);
> - WRITE_ONCE(cpudata->min_freq, min_freq);
>
> - max_freq = perf_to_freq(cpudata, cpudata->highest_perf);
> - lowest_nonlinear_freq = perf_to_freq(cpudata, cpudata->lowest_nonlinear_perf);
> -
> - WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
> - WRITE_ONCE(cpudata->max_freq, max_freq);
> + if (quirks && quirks->lowest_freq) {
> + min_freq = quirks->lowest_freq;
> + } else
> + min_freq = cppc_perf.lowest_freq;
>
> /**
> * Below values need to be initialized correctly, otherwise driver will fail to load
> @@ -937,12 +923,15 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
> * lowest_nonlinear_freq is a value between [min_freq, nominal_freq]
> * Check _CPC in ACPI table objects if any values are incorrect
> */
> - if (min_freq <= 0 || max_freq <= 0 || nominal_freq <= 0 || min_freq > max_freq) {
> - pr_err("min_freq(%d) or max_freq(%d) or nominal_freq(%d) value is incorrect\n",
> - min_freq, max_freq, nominal_freq);
> + if (nominal_freq <= 0) {
> + pr_err("nominal_freq(%d) value is incorrect\n",
> + nominal_freq);
> return -EINVAL;
> }
>
> + lowest_nonlinear_freq = perf_to_freq(cpudata, cpudata->lowest_nonlinear_perf);
> + WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
> +
> if (lowest_nonlinear_freq <= min_freq || lowest_nonlinear_freq > nominal_freq) {
> pr_err("lowest_nonlinear_freq(%d) value is out of range [min_freq(%d), nominal_freq(%d)]\n",
> lowest_nonlinear_freq, min_freq, nominal_freq);
> @@ -954,9 +943,9 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
>
> static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> {
> - int min_freq, max_freq, ret;
> - struct device *dev;
> struct amd_cpudata *cpudata;
> + struct device *dev;
> + int ret;
>
> /*
> * Resetting PERF_CTL_MSR will put the CPU in P0 frequency,
> @@ -987,17 +976,11 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> if (ret)
> goto free_cpudata1;
>
> - min_freq = READ_ONCE(cpudata->min_freq);
> - max_freq = READ_ONCE(cpudata->max_freq);
> -
> policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
> policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
>
> - policy->min = min_freq;
> - policy->max = max_freq;
> -
> - policy->cpuinfo.min_freq = min_freq;
> - policy->cpuinfo.max_freq = max_freq;
> + policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
> + policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
>
> policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
>
> @@ -1021,9 +1004,6 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> goto free_cpudata2;
> }
>
> - cpudata->max_limit_freq = max_freq;
> - cpudata->min_limit_freq = min_freq;
> -
> policy->driver_data = cpudata;
>
> if (!current_pstate_driver->adjust_perf)
> @@ -1081,14 +1061,10 @@ static int amd_pstate_cpu_suspend(struct cpufreq_policy *policy)
> static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
> char *buf)
> {
> - int max_freq;
> struct amd_cpudata *cpudata = policy->driver_data;
>
> - max_freq = READ_ONCE(cpudata->max_freq);
> - if (max_freq < 0)
> - return max_freq;
>
> - return sysfs_emit(buf, "%u\n", max_freq);
> + return sysfs_emit(buf, "%u\n", perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf)));
> }
>
> static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *policy,
> @@ -1446,10 +1422,10 @@ static bool amd_pstate_acpi_pm_profile_undefined(void)
>
> static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> {
> - int min_freq, max_freq, ret;
> struct amd_cpudata *cpudata;
> struct device *dev;
> u64 value;
> + int ret;
>
> /*
> * Resetting PERF_CTL_MSR will put the CPU in P0 frequency,
> @@ -1480,19 +1456,13 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> if (ret)
> goto free_cpudata1;
>
> - min_freq = READ_ONCE(cpudata->min_freq);
> - max_freq = READ_ONCE(cpudata->max_freq);
> -
> - policy->cpuinfo.min_freq = min_freq;
> - policy->cpuinfo.max_freq = max_freq;
> + policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
> + policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
> /* It will be updated by governor */
> policy->cur = policy->cpuinfo.min_freq;
>
> policy->driver_data = cpudata;
>
> - policy->min = policy->cpuinfo.min_freq;
> - policy->max = policy->cpuinfo.max_freq;
> -
> policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
>
> /*
> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
> index 19d405c6d805e..472044a1de43b 100644
> --- a/drivers/cpufreq/amd-pstate.h
> +++ b/drivers/cpufreq/amd-pstate.h
> @@ -44,10 +44,6 @@ struct amd_aperf_mperf {
> * priority.
> * @min_limit_perf: Cached value of the performance corresponding to policy->min
> * @max_limit_perf: Cached value of the performance corresponding to policy->max
> - * @min_limit_freq: Cached value of policy->min (in khz)
> - * @max_limit_freq: Cached value of policy->max (in khz)
> - * @max_freq: the frequency (in khz) that mapped to highest_perf
> - * @min_freq: the frequency (in khz) that mapped to lowest_perf
> * @nominal_freq: the frequency (in khz) that mapped to nominal_perf
> * @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
> * @cur: Difference of Aperf/Mperf/tsc count between last and current sample
> @@ -77,11 +73,7 @@ struct amd_cpudata {
> u8 prefcore_ranking;
> u8 min_limit_perf;
> u8 max_limit_perf;
> - u32 min_limit_freq;
> - u32 max_limit_freq;
>
> - u32 max_freq;
> - u32 min_freq;
> u32 nominal_freq;
> u32 lowest_nonlinear_freq;
>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 02/14] cpufreq/amd-pstate: Drop min and max cached frequencies
2025-02-07 10:44 ` Dhananjay Ugwekar
@ 2025-02-07 16:15 ` Mario Limonciello
0 siblings, 0 replies; 41+ messages in thread
From: Mario Limonciello @ 2025-02-07 16:15 UTC (permalink / raw)
To: Dhananjay Ugwekar, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 04:44, Dhananjay Ugwekar wrote:
> Hello Mario,
>
> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> Use the perf_to_freq helpers to calculate this on the fly.
>
> I think there is a benefit to having the min/max_limit_freq values cached.
> These values help us avoid unnecessary amd_pstate_update_min_max_limit() calls in
> majority of cases (where the policy->min/max values didnt change).
>
> For the cpudata->min/max_freq values, I think there is little value in caching them,
> i.e. only used in amd_pstate_cpu_boost_update() and show_amd_pstate_max_freq(), which
> are not supposed to be called very frequently.
>
> So, I propose we keep the cpudata->min/max_limit_freq variables and remove the
> cpudata->min/max_freq ones. Thoughts?
Yeah, I can see this argument making sense and the caching being worth
keeping. I'll modify it for the next version.
* Re: [PATCH 01/14] cpufreq/amd-pstate: Show a warning when a CPU fails to setup
2025-02-06 21:56 ` [PATCH 01/14] cpufreq/amd-pstate: Show a warning when a CPU fails to setup Mario Limonciello
@ 2025-02-10 11:59 ` Dhananjay Ugwekar
2025-02-10 13:50 ` Gautham R. Shenoy
0 siblings, 1 reply; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-10 11:59 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't
> populated. This is an unexpected behavior that is most likely a
> BIOS bug. In the event it happens I'd like users to report bugs
> to properly root cause and get this fixed.
I'm okay with this patch, but I see a similar pr_debug in caller cpufreq_online(),
so not sure if this is strictly necessary.
1402 /*
1403 * Call driver. From then on the cpufreq must be able
1404 * to accept all calls to ->verify and ->setpolicy for this CPU.
1405 */
1406 ret = cpufreq_driver->init(policy);
1407 if (ret) {
1408 pr_debug("%s: %d: initialization failed\n", __func__,
1409 __LINE__);
1410 goto out_free_policy;
1411
Thanks,
Dhananjay
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index f425fb7ec77d7..573643654e8d6 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -1034,6 +1034,7 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> free_cpudata2:
> freq_qos_remove_request(&cpudata->req[0]);
> free_cpudata1:
> + pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret);
> kfree(cpudata);
> return ret;
> }
> @@ -1527,6 +1528,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> return 0;
>
> free_cpudata1:
> + pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret);
> kfree(cpudata);
> return ret;
> }
* Re: [PATCH 03/14] cpufreq/amd-pstate: Move perf values into a union
2025-02-06 21:56 ` [PATCH 03/14] cpufreq/amd-pstate: Move perf values into a union Mario Limonciello
@ 2025-02-10 13:38 ` Dhananjay Ugwekar
2025-02-11 22:14 ` Mario Limonciello
0 siblings, 1 reply; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-10 13:38 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> By storing perf values in a union all the writes and reads can
> be done atomically, removing the need for some concurrency protections.
>
> While making this change, also drop the cached frequency values,
> using inline helpers to calculate them on demand from perf value.
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate-ut.c | 17 +--
> drivers/cpufreq/amd-pstate.c | 212 +++++++++++++++++++-------------
> drivers/cpufreq/amd-pstate.h | 48 +++++---
> 3 files changed, 163 insertions(+), 114 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
> index 445278cf40b61..d9ab98c6f56b1 100644
> --- a/drivers/cpufreq/amd-pstate-ut.c
> +++ b/drivers/cpufreq/amd-pstate-ut.c
> @@ -162,19 +162,20 @@ static void amd_pstate_ut_check_perf(u32 index)
> lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
> }
>
> - if (highest_perf != READ_ONCE(cpudata->highest_perf) && !cpudata->hw_prefcore) {
> + if (highest_perf != READ_ONCE(cpudata->perf.highest_perf) &&
> + !cpudata->hw_prefcore) {
> pr_err("%s cpu%d highest=%d %d highest perf doesn't match\n",
> - __func__, cpu, highest_perf, cpudata->highest_perf);
> + __func__, cpu, highest_perf, cpudata->perf.highest_perf);
> goto skip_test;
> }
> - if ((nominal_perf != READ_ONCE(cpudata->nominal_perf)) ||
> - (lowest_nonlinear_perf != READ_ONCE(cpudata->lowest_nonlinear_perf)) ||
> - (lowest_perf != READ_ONCE(cpudata->lowest_perf))) {
> + if ((nominal_perf != READ_ONCE(cpudata->perf.nominal_perf)) ||
> + (lowest_nonlinear_perf != READ_ONCE(cpudata->perf.lowest_nonlinear_perf)) ||
> + (lowest_perf != READ_ONCE(cpudata->perf.lowest_perf))) {
How about making a local copy of cpudata->perf and using that, instead of dereferencing the
cpudata pointer multiple times? Something like:

	union perf_cached cur_perf = READ_ONCE(cpudata->perf);

	if ((nominal_perf != cur_perf.nominal_perf) ||
	    (lowest_nonlinear_perf != cur_perf.lowest_nonlinear_perf) ||
	    (lowest_perf != cur_perf.lowest_perf)) {
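(A local snapshot also guarantees all three comparisons see one consistent copy of the
union, rather than three independent reads of cpudata->perf.)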
> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
> pr_err("%s cpu%d nominal=%d %d lowest_nonlinear=%d %d lowest=%d %d, they should be equal!\n",
> - __func__, cpu, nominal_perf, cpudata->nominal_perf,
> - lowest_nonlinear_perf, cpudata->lowest_nonlinear_perf,
> - lowest_perf, cpudata->lowest_perf);
> + __func__, cpu, nominal_perf, cpudata->perf.nominal_perf,
> + lowest_nonlinear_perf, cpudata->perf.lowest_nonlinear_perf,
> + lowest_perf, cpudata->perf.lowest_perf);
> goto skip_test;
> }
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 668377f55b630..77bc6418731ee 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -142,18 +142,17 @@ static struct quirk_entry quirk_amd_7k62 = {
> .lowest_freq = 550,
> };
>
> -static inline u8 freq_to_perf(struct amd_cpudata *cpudata, unsigned int freq_val)
> +static inline u8 freq_to_perf(union perf_cached perf, u32 nominal_freq, unsigned int freq_val)
> {
> - u8 perf_val = DIV_ROUND_UP_ULL((u64)freq_val * cpudata->nominal_perf,
> - cpudata->nominal_freq);
> + u8 perf_val = DIV_ROUND_UP_ULL((u64)freq_val * perf.nominal_perf, nominal_freq);
>
> - return clamp_t(u8, perf_val, cpudata->lowest_perf, cpudata->highest_perf);
> + return clamp_t(u8, perf_val, perf.lowest_perf, perf.highest_perf);
> }
>
> -static inline u32 perf_to_freq(struct amd_cpudata *cpudata, u8 perf_val)
> +static inline u32 perf_to_freq(union perf_cached perf, u32 nominal_freq, u8 perf_val)
> {
> - return DIV_ROUND_UP_ULL((u64)cpudata->nominal_freq * perf_val,
> - cpudata->nominal_perf);
> + return DIV_ROUND_UP_ULL((u64)nominal_freq * perf_val,
> + perf.nominal_perf);
> }
>
> static int __init dmi_matched_7k62_bios_bug(const struct dmi_system_id *dmi)
> @@ -347,7 +346,9 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
> }
>
> if (trace_amd_pstate_epp_perf_enabled()) {
> - trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf,
> + union perf_cached perf = cpudata->perf;
> +
> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
> epp,
> FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
> FIELD_GET(AMD_CPPC_MAX_PERF_MASK, cpudata->cppc_req_cached),
> @@ -425,6 +426,7 @@ static inline int amd_pstate_cppc_enable(bool enable)
>
> static int msr_init_perf(struct amd_cpudata *cpudata)
> {
> + union perf_cached perf = cpudata->perf;
> u64 cap1, numerator;
>
> int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
> @@ -436,19 +438,21 @@ static int msr_init_perf(struct amd_cpudata *cpudata)
> if (ret)
> return ret;
>
> - WRITE_ONCE(cpudata->highest_perf, numerator);
> - WRITE_ONCE(cpudata->max_limit_perf, numerator);
> - WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
> - WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
> - WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));
> + perf.highest_perf = numerator;
> + perf.max_limit_perf = numerator;
> + perf.min_limit_perf = AMD_CPPC_LOWEST_PERF(cap1);
> + perf.nominal_perf = AMD_CPPC_NOMINAL_PERF(cap1);
> + perf.lowest_nonlinear_perf = AMD_CPPC_LOWNONLIN_PERF(cap1);
> + perf.lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
> + WRITE_ONCE(cpudata->perf, perf);
> WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1));
> - WRITE_ONCE(cpudata->min_limit_perf, AMD_CPPC_LOWEST_PERF(cap1));
> return 0;
> }
>
> static int shmem_init_perf(struct amd_cpudata *cpudata)
> {
> struct cppc_perf_caps cppc_perf;
> + union perf_cached perf = cpudata->perf;
> u64 numerator;
>
> int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> @@ -459,14 +463,14 @@ static int shmem_init_perf(struct amd_cpudata *cpudata)
> if (ret)
> return ret;
>
> - WRITE_ONCE(cpudata->highest_perf, numerator);
> - WRITE_ONCE(cpudata->max_limit_perf, numerator);
> - WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
> - WRITE_ONCE(cpudata->lowest_nonlinear_perf,
> - cppc_perf.lowest_nonlinear_perf);
> - WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
> + perf.highest_perf = numerator;
> + perf.max_limit_perf = numerator;
> + perf.min_limit_perf = cppc_perf.lowest_perf;
> + perf.nominal_perf = cppc_perf.nominal_perf;
> + perf.lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
> + perf.lowest_perf = cppc_perf.lowest_perf;
> + WRITE_ONCE(cpudata->perf, perf);
> WRITE_ONCE(cpudata->prefcore_ranking, cppc_perf.highest_perf);
> - WRITE_ONCE(cpudata->min_limit_perf, cppc_perf.lowest_perf);
>
> if (cppc_state == AMD_PSTATE_ACTIVE)
> return 0;
> @@ -549,14 +553,14 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u8 min_perf,
> u8 des_perf, u8 max_perf, bool fast_switch, int gov_flags)
> {
> struct cpufreq_policy *policy __free(put_cpufreq_policy) = cpufreq_cpu_get(cpudata->cpu);
> - u8 nominal_perf = READ_ONCE(cpudata->nominal_perf);
> + union perf_cached perf = READ_ONCE(cpudata->perf);
>
> if (!policy)
> return;
>
> des_perf = clamp_t(u8, des_perf, min_perf, max_perf);
>
> - policy->cur = perf_to_freq(cpudata, des_perf);
> + policy->cur = perf_to_freq(perf, cpudata->nominal_freq, des_perf);
>
> if ((cppc_state == AMD_PSTATE_GUIDED) && (gov_flags & CPUFREQ_GOV_DYNAMIC_SWITCHING)) {
> min_perf = des_perf;
> @@ -565,7 +569,7 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u8 min_perf,
>
> /* limit the max perf when core performance boost feature is disabled */
> if (!cpudata->boost_supported)
> - max_perf = min_t(u8, nominal_perf, max_perf);
> + max_perf = min_t(u8, perf.nominal_perf, max_perf);
>
> if (trace_amd_pstate_perf_enabled() && amd_pstate_sample(cpudata)) {
> trace_amd_pstate_perf(min_perf, des_perf, max_perf, cpudata->freq,
> @@ -602,36 +606,41 @@ static int amd_pstate_verify(struct cpufreq_policy_data *policy_data)
> return 0;
> }
>
> -static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
> +static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
> {
> - u8 max_limit_perf, min_limit_perf;
> struct amd_cpudata *cpudata = policy->driver_data;
> + union perf_cached perf = READ_ONCE(cpudata->perf);
>
> - max_limit_perf = freq_to_perf(cpudata, policy->max);
> - min_limit_perf = freq_to_perf(cpudata, policy->min);
> + if (policy->min == perf_to_freq(perf, cpudata->nominal_freq, perf.min_limit_perf) &&
> + policy->max == perf_to_freq(perf, cpudata->nominal_freq, perf.max_limit_perf))
> + return;
I guess we can remove this check once we reinstate the min/max_limit_freq caching in cpudata as
discussed in patch #2, right?
>
> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> - min_limit_perf = min(cpudata->nominal_perf, max_limit_perf);
> + perf.max_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->max);
> + perf.min_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->min);
>
> - WRITE_ONCE(cpudata->max_limit_perf, max_limit_perf);
> - WRITE_ONCE(cpudata->min_limit_perf, min_limit_perf);
> + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> + perf.min_limit_perf = min(perf.nominal_perf, perf.max_limit_perf);
>
> - return 0;
> + WRITE_ONCE(cpudata->perf, perf);
> }
>
> static int amd_pstate_update_freq(struct cpufreq_policy *policy,
> unsigned int target_freq, bool fast_switch)
> {
> struct cpufreq_freqs freqs;
> - struct amd_cpudata *cpudata = policy->driver_data;
> + struct amd_cpudata *cpudata;
> + union perf_cached perf;
> u8 des_perf;
>
> amd_pstate_update_min_max_limit(policy);
>
> + cpudata = policy->driver_data;
Any specific reason why we moved this dereferencing after amd_pstate_update_min_max_limit()?
> + perf = READ_ONCE(cpudata->perf);
> +
> freqs.old = policy->cur;
> freqs.new = target_freq;
>
> - des_perf = freq_to_perf(cpudata, target_freq);
> + des_perf = freq_to_perf(perf, cpudata->nominal_freq, target_freq);
Personally I preferred the earlier 2-argument format for the helper functions, as the helper
function handled the common dereferencing part (i.e. cpudata->perf and cpudata->nominal_freq).
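For illustration, a thin wrapper along these lines (untested sketch) would keep that
dereferencing in one place while still snapshotting the union once:

	static inline u32 perf_to_freq(struct amd_cpudata *cpudata, u8 perf_val)
	{
		union perf_cached perf = READ_ONCE(cpudata->perf);

		/* scale perf against the nominal perf/frequency operating point */
		return DIV_ROUND_UP_ULL((u64)cpudata->nominal_freq * perf_val,
					perf.nominal_perf);
	}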
>
> WARN_ON(fast_switch && !policy->fast_switch_enabled);
> /*
> @@ -642,8 +651,8 @@ static int amd_pstate_update_freq(struct cpufreq_policy *policy,
> if (!fast_switch)
> cpufreq_freq_transition_begin(policy, &freqs);
>
> - amd_pstate_update(cpudata, cpudata->min_limit_perf, des_perf,
> - cpudata->max_limit_perf, fast_switch,
> + amd_pstate_update(cpudata, perf.min_limit_perf, des_perf,
> + perf.max_limit_perf, fast_switch,
> policy->governor->flags);
>
> if (!fast_switch)
> @@ -672,19 +681,19 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
> unsigned long target_perf,
> unsigned long capacity)
> {
> - u8 max_perf, min_perf, des_perf, cap_perf, min_limit_perf;
> + u8 max_perf, min_perf, des_perf, cap_perf;
> struct cpufreq_policy *policy __free(put_cpufreq_policy) = cpufreq_cpu_get(cpu);
> struct amd_cpudata *cpudata;
> + union perf_cached perf;
>
> if (!policy)
> return;
>
> - cpudata = policy->driver_data;
> -
> amd_pstate_update_min_max_limit(policy);
>
> - cap_perf = READ_ONCE(cpudata->highest_perf);
> - min_limit_perf = READ_ONCE(cpudata->min_limit_perf);
> + cpudata = policy->driver_data;
Similar question as above
> + perf = READ_ONCE(cpudata->perf);
> + cap_perf = perf.highest_perf;
>
> des_perf = cap_perf;
> if (target_perf < capacity)
> @@ -695,10 +704,10 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
> else
> min_perf = cap_perf;
>
> - if (min_perf < min_limit_perf)
> - min_perf = min_limit_perf;
> + if (min_perf < perf.min_limit_perf)
> + min_perf = perf.min_limit_perf;
>
> - max_perf = cpudata->max_limit_perf;
> + max_perf = perf.max_limit_perf;
> if (max_perf < min_perf)
> max_perf = min_perf;
>
> @@ -709,11 +718,12 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
> static int amd_pstate_cpu_boost_update(struct cpufreq_policy *policy, bool on)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> + union perf_cached perf = READ_ONCE(cpudata->perf);
> u32 nominal_freq, max_freq;
> int ret = 0;
>
> nominal_freq = READ_ONCE(cpudata->nominal_freq);
> - max_freq = perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf));
> + max_freq = perf_to_freq(perf, cpudata->nominal_freq, perf.highest_perf);
>
> if (on)
> policy->cpuinfo.max_freq = max_freq;
> @@ -884,25 +894,24 @@ static u32 amd_pstate_get_transition_latency(unsigned int cpu)
> }
>
> /*
> - * amd_pstate_init_freq: Initialize the max_freq, min_freq,
> - * nominal_freq and lowest_nonlinear_freq for
> - * the @cpudata object.
> + * amd_pstate_init_freq: Initialize the nominal_freq and lowest_nonlinear_freq
> + * for the @cpudata object.
> *
> - * Requires: highest_perf, lowest_perf, nominal_perf and
> - * lowest_nonlinear_perf members of @cpudata to be
> - * initialized.
> + * Requires: all perf members of @cpudata to be initialized.
> *
> - * Returns 0 on success, non-zero value on failure.
> + * Returns 0 on success, non-zero value on failure.
> */
> static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
> {
> - int ret;
> u32 min_freq, nominal_freq, lowest_nonlinear_freq;
> struct cppc_perf_caps cppc_perf;
> + union perf_cached perf;
> + int ret;
>
> ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> if (ret)
> return ret;
> + perf = READ_ONCE(cpudata->perf);
>
> if (quirks && quirks->nominal_freq)
> nominal_freq = quirks->nominal_freq;
> @@ -914,6 +923,7 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
>
> if (quirks && quirks->lowest_freq) {
> min_freq = quirks->lowest_freq;
> + perf.lowest_perf = freq_to_perf(perf, nominal_freq, min_freq);
> } else
> min_freq = cppc_perf.lowest_freq;
>
> @@ -929,7 +939,7 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
> return -EINVAL;
> }
>
> - lowest_nonlinear_freq = perf_to_freq(cpudata, cpudata->lowest_nonlinear_perf);
> + lowest_nonlinear_freq = perf_to_freq(perf, nominal_freq, perf.lowest_nonlinear_perf);
> WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
>
> if (lowest_nonlinear_freq <= min_freq || lowest_nonlinear_freq > nominal_freq) {
> @@ -944,6 +954,7 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
> static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> {
> struct amd_cpudata *cpudata;
> + union perf_cached perf;
> struct device *dev;
> int ret;
>
> @@ -979,8 +990,14 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
> policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
>
> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
> - policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
> + perf = READ_ONCE(cpudata->perf);
> +
> + policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
> + cpudata->nominal_freq,
> + perf.lowest_perf);
> + policy->cpuinfo.max_freq = policy->max = perf_to_freq(perf,
> + cpudata->nominal_freq,
> + perf.highest_perf);
>
> policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
>
> @@ -1061,23 +1078,33 @@ static int amd_pstate_cpu_suspend(struct cpufreq_policy *policy)
> static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
> char *buf)
> {
> - struct amd_cpudata *cpudata = policy->driver_data;
> + struct amd_cpudata *cpudata;
> + union perf_cached perf;
> +
> + if (!policy)
> + return -EINVAL;
Do we need to check policy if it is being passed in from a sysfs file access?
I don't see a similar check in the show_one based sysfs functions in cpufreq.c; they just
dereference it directly.
#define show_one(file_name, object) \
static ssize_t show_##file_name \
(struct cpufreq_policy *policy, char *buf) \
{ \
return sysfs_emit(buf, "%u\n", policy->object); \
}
show_one(cpuinfo_min_freq, cpuinfo.min_freq);
show_one(cpuinfo_max_freq, cpuinfo.max_freq);
show_one(cpuinfo_transition_latency, cpuinfo.transition_latency);
show_one(scaling_min_freq, min);
show_one(scaling_max_freq, max)
>
> + cpudata = policy->driver_data;
> + perf = READ_ONCE(cpudata->perf);
>
> - return sysfs_emit(buf, "%u\n", perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf)));
> + return sysfs_emit(buf, "%u\n",
> + perf_to_freq(perf, cpudata->nominal_freq, perf.max_limit_perf));
For example, this function was a lot cleaner before, as perf_to_freq() handled the common
dereferencing part.
> }
>
> static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *policy,
> char *buf)
> {
> - int freq;
> - struct amd_cpudata *cpudata = policy->driver_data;
> + struct amd_cpudata *cpudata;
> + union perf_cached perf;
> +
> + if (!policy)
> + return -EINVAL;
Similar reason as above: is this check needed?
>
> - freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
> - if (freq < 0)
> - return freq;
> + cpudata = policy->driver_data;
> + perf = READ_ONCE(cpudata->perf);
>
> - return sysfs_emit(buf, "%u\n", freq);
> + return sysfs_emit(buf, "%u\n",
> + perf_to_freq(perf, cpudata->nominal_freq, perf.lowest_nonlinear_perf));
Same comment about doing the dereferencing in the helper function.
> }
>
> /*
> @@ -1087,12 +1114,14 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
> static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
> char *buf)
> {
> - u8 perf;
> - struct amd_cpudata *cpudata = policy->driver_data;
> + struct amd_cpudata *cpudata;
>
> - perf = READ_ONCE(cpudata->highest_perf);
> + if (!policy)
> + return -EINVAL;
Same comment; can we remove this if it's unnecessary?
>
> - return sysfs_emit(buf, "%u\n", perf);
> + cpudata = policy->driver_data;
> +
> + return sysfs_emit(buf, "%u\n", cpudata->perf.highest_perf);
> }
>
> static ssize_t show_amd_pstate_prefcore_ranking(struct cpufreq_policy *policy,
> @@ -1423,6 +1452,7 @@ static bool amd_pstate_acpi_pm_profile_undefined(void)
> static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> {
> struct amd_cpudata *cpudata;
> + union perf_cached perf;
> struct device *dev;
> u64 value;
> int ret;
> @@ -1456,8 +1486,15 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> if (ret)
> goto free_cpudata1;
>
> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
> - policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
> + perf = READ_ONCE(cpudata->perf);
> +
> + policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
> + cpudata->nominal_freq,
> + perf.lowest_perf);
> + policy->cpuinfo.max_freq = policy->max = perf_to_freq(perf,
> + cpudata->nominal_freq,
> + perf.highest_perf);
> +
> /* It will be updated by governor */
> policy->cur = policy->cpuinfo.min_freq;
>
> @@ -1518,6 +1555,7 @@ static void amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
> static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> + union perf_cached perf;
> u8 epp;
>
> amd_pstate_update_min_max_limit(policy);
> @@ -1527,15 +1565,16 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
> else
> epp = READ_ONCE(cpudata->epp_cached);
>
> + perf = READ_ONCE(cpudata->perf);
> if (trace_amd_pstate_epp_perf_enabled()) {
> - trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf, epp,
> - cpudata->min_limit_perf,
> - cpudata->max_limit_perf,
> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf, epp,
> + perf.min_limit_perf,
> + perf.max_limit_perf,
> policy->boost_enabled);
> }
>
> - return amd_pstate_update_perf(cpudata, cpudata->min_limit_perf, 0U,
> - cpudata->max_limit_perf, epp, false);
> + return amd_pstate_update_perf(cpudata, perf.min_limit_perf, 0U,
> + perf.max_limit_perf, epp, false);
> }
>
> static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> @@ -1567,23 +1606,21 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> - u8 max_perf;
> + union perf_cached perf = cpudata->perf;
Do we have a rule for when READ_ONCE is needed and when it isn't?
I'm a bit fuzzy on how to decide in this case. Any rule of thumb?
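To make the question concrete, the two patterns in the series differ like this (illustrative):

	/* snapshot: a single, non-torn 64-bit load of the whole union */
	union perf_cached perf = READ_ONCE(cpudata->perf);

vs.

	/* plain read: the compiler is free to reload cpudata->perf later */
	union perf_cached perf = cpudata->perf;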
> int ret;
>
> ret = amd_pstate_cppc_enable(true);
> if (ret)
> pr_err("failed to enable amd pstate during resume, return %d\n", ret);
>
> - max_perf = READ_ONCE(cpudata->highest_perf);
> -
> if (trace_amd_pstate_epp_perf_enabled()) {
> - trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf,
> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
> cpudata->epp_cached,
> FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
> - max_perf, policy->boost_enabled);
> + perf.highest_perf, policy->boost_enabled);
> }
>
> - return amd_pstate_update_perf(cpudata, 0, 0, max_perf, cpudata->epp_cached, false);
> + return amd_pstate_update_perf(cpudata, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
> }
>
> static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
> @@ -1604,22 +1641,21 @@ static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
> static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> - u8 min_perf;
> + union perf_cached perf = cpudata->perf;
>
> if (cpudata->suspended)
> return 0;
>
> - min_perf = READ_ONCE(cpudata->lowest_perf);
> -
> guard(mutex)(&amd_pstate_limits_lock);
>
> if (trace_amd_pstate_epp_perf_enabled()) {
> - trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf,
> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
> AMD_CPPC_EPP_BALANCE_POWERSAVE,
> - min_perf, min_perf, policy->boost_enabled);
> + perf.lowest_perf, perf.lowest_perf,
> + policy->boost_enabled);
> }
>
> - return amd_pstate_update_perf(cpudata, min_perf, 0, min_perf,
> + return amd_pstate_update_perf(cpudata, perf.lowest_perf, 0, perf.lowest_perf,
> AMD_CPPC_EPP_BALANCE_POWERSAVE, false);
> }
>
> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
> index 472044a1de43b..a140704b97430 100644
> --- a/drivers/cpufreq/amd-pstate.h
> +++ b/drivers/cpufreq/amd-pstate.h
> @@ -13,6 +13,34 @@
> /*********************************************************************
> * AMD P-state INTERFACE *
> *********************************************************************/
> +
> +/**
> + * union perf_cached - A union to cache performance-related data.
> + * @highest_perf: the maximum performance an individual processor may reach,
> + * assuming ideal conditions
> + * For platforms that do not support the preferred core feature, the
> + * highest_pef may be configured with 166 or 255, to avoid max frequency
> + * calculated wrongly. we take the fixed value as the highest_perf.
> + * @nominal_perf: the maximum sustained performance level of the processor,
> + * assuming ideal operating conditions
> + * @lowest_nonlinear_perf: the lowest performance level at which nonlinear power
> + * savings are achieved
> + * @lowest_perf: the absolute lowest performance level of the processor
> + * @min_limit_perf: Cached value of the performance corresponding to policy->min
> + * @max_limit_perf: Cached value of the performance corresponding to policy->max
> + */
> +union perf_cached {
> + struct {
> + u8 highest_perf;
> + u8 nominal_perf;
> + u8 lowest_nonlinear_perf;
> + u8 lowest_perf;
> + u8 min_limit_perf;
> + u8 max_limit_perf;
> + };
> + u64 val;
> +};
> +
> /**
> * struct amd_aperf_mperf
> * @aperf: actual performance frequency clock count
> @@ -30,20 +58,8 @@ struct amd_aperf_mperf {
> * @cpu: CPU number
> * @req: constraint request to apply
> * @cppc_req_cached: cached performance request hints
> - * @highest_perf: the maximum performance an individual processor may reach,
> - * assuming ideal conditions
> - * For platforms that do not support the preferred core feature, the
> - * highest_pef may be configured with 166 or 255, to avoid max frequency
> - * calculated wrongly. we take the fixed value as the highest_perf.
> - * @nominal_perf: the maximum sustained performance level of the processor,
> - * assuming ideal operating conditions
> - * @lowest_nonlinear_perf: the lowest performance level at which nonlinear power
> - * savings are achieved
> - * @lowest_perf: the absolute lowest performance level of the processor
> * @prefcore_ranking: the preferred core ranking, the higher value indicates a higher
> * priority.
> - * @min_limit_perf: Cached value of the performance corresponding to policy->min
> - * @max_limit_perf: Cached value of the performance corresponding to policy->max
> * @nominal_freq: the frequency (in khz) that mapped to nominal_perf
> * @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
> * @cur: Difference of Aperf/Mperf/tsc count between last and current sample
> @@ -66,13 +82,9 @@ struct amd_cpudata {
> struct freq_qos_request req[2];
> u64 cppc_req_cached;
>
> - u8 highest_perf;
> - u8 nominal_perf;
> - u8 lowest_nonlinear_perf;
> - u8 lowest_perf;
> + union perf_cached perf;
Can we please add a description for this in the comment above?
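e.g. something along these lines (wording is only a suggestion):

	 * @perf:	cached union of CPPC performance levels and the current
	 *		min/max limit perf values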
> +
> u8 prefcore_ranking;
> - u8 min_limit_perf;
> - u8 max_limit_perf;
>
> u32 nominal_freq;
> u32 lowest_nonlinear_freq;
* Re: [PATCH 01/14] cpufreq/amd-pstate: Show a warning when a CPU fails to setup
2025-02-10 11:59 ` Dhananjay Ugwekar
@ 2025-02-10 13:50 ` Gautham R. Shenoy
2025-02-10 15:13 ` Mario Limonciello
0 siblings, 1 reply; 41+ messages in thread
From: Gautham R. Shenoy @ 2025-02-10 13:50 UTC (permalink / raw)
To: Dhananjay Ugwekar
Cc: Mario Limonciello, Perry Yuan,
open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On Mon, Feb 10, 2025 at 05:29:24PM +0530, Dhananjay Ugwekar wrote:
> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> > From: Mario Limonciello <mario.limonciello@amd.com>
> >
> > I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't
> > populated. This is an unexpected behavior that is most likely a
> > BIOS bug. In the event it happens I'd like users to report bugs
> > to properly root cause and get this fixed.
>
> I'm okay with this patch, but I see a similar pr_debug in caller cpufreq_online(),
> so not sure if this is strictly necessary.
>
> 1402 /*
> 1403 * Call driver. From then on the cpufreq must be able
> 1404 * to accept all calls to ->verify and ->setpolicy for this CPU.
> 1405 */
> 1406 ret = cpufreq_driver->init(policy);
> 1407 if (ret) {
> 1408 pr_debug("%s: %d: initialization failed\n", __func__,
> 1409 __LINE__);
> 1410 goto out_free_policy;
> 1411
>
Well, the pr_debug() doesn't always get printed unless the loglevel is
set to debug, which is usually done by the developers and not the end
users.
However, you have a point: since the code jumps to free_cpudata1 on
failures from amd_pstate_init_perf(), amd_pstate_init_freq(),
amd_pstate_init_boost_support(), and freq_qos_add_request(), the
pr_warn() doesn't indicate that the failure is due to
MSR_AMD_CPPC_CAP1 not being populated.
--
Thanks and Regards
gautham.
> Thanks,
> Dhananjay
>
> >
> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > ---
> > drivers/cpufreq/amd-pstate.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> > index f425fb7ec77d7..573643654e8d6 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -1034,6 +1034,7 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> > free_cpudata2:
> > freq_qos_remove_request(&cpudata->req[0]);
> > free_cpudata1:
> > + pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret);
> > kfree(cpudata);
> > return ret;
> > }
> > @@ -1527,6 +1528,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> > return 0;
> >
> > free_cpudata1:
> > + pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret);
> > kfree(cpudata);
> > return ret;
> > }
>
* Re: [PATCH 01/14] cpufreq/amd-pstate: Show a warning when a CPU fails to setup
2025-02-10 13:50 ` Gautham R. Shenoy
@ 2025-02-10 15:13 ` Mario Limonciello
0 siblings, 0 replies; 41+ messages in thread
From: Mario Limonciello @ 2025-02-10 15:13 UTC (permalink / raw)
To: Gautham R. Shenoy, Dhananjay Ugwekar
Cc: Perry Yuan, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/10/2025 07:50, Gautham R. Shenoy wrote:
> On Mon, Feb 10, 2025 at 05:29:24PM +0530, Dhananjay Ugwekar wrote:
>> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>>> From: Mario Limonciello <mario.limonciello@amd.com>
>>>
>>> I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't
>>> populated. This is an unexpected behavior that is most likely a
>>> BIOS bug. In the event it happens I'd like users to report bugs
>>> to properly root cause and get this fixed.
>>
>> I'm okay with this patch, but I see a similar pr_debug in caller cpufreq_online(),
>> so not sure if this is strictly necessary.
>>
>> 1402 /*
>> 1403 * Call driver. From then on the cpufreq must be able
>> 1404 * to accept all calls to ->verify and ->setpolicy for this CPU.
>> 1405 */
>> 1406 ret = cpufreq_driver->init(policy);
>> 1407 if (ret) {
>> 1408 pr_debug("%s: %d: initialization failed\n", __func__,
>> 1409 __LINE__);
>> 1410 goto out_free_policy;
>> 1411
>>
>
> Well, the pr_debug() doesn't always get printed unless the loglevel is
> set to debug, which is usually done by the developers and not the end
> users.
>
> However you have a point that since the code jumps to free_cpudata1 on
> failures from amd_pstate_init_perf(), amd_pstate_init_freq(),
> amd_pstate_init_boost_support(), freq_qos_add_request(). So the
> pr_warn() doesn't indicate that the failure is due to
> MSR_AMD_CPPC_CAP1 not being populated.
>
Right; my point is that without the warning no one knows there is a problem.
I don't expect we can anticipate all the potential causes, and I want
anyone who hits this to raise a bug so that we can ask them to turn on
dynamic debug / ftrace and then triage it.
* Re: [PATCH 04/14] cpufreq/amd-pstate: Overhaul locking
2025-02-06 21:56 ` [PATCH 04/14] cpufreq/amd-pstate: Overhaul locking Mario Limonciello
@ 2025-02-11 5:02 ` Dhananjay Ugwekar
2025-02-11 21:54 ` Mario Limonciello
0 siblings, 1 reply; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-11 5:02 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> amd_pstate_cpu_boost_update() and refresh_frequency_limits() both
> update the policy state and have nothing to do with the amd-pstate
> driver itself.
>
> A global "limits" lock doesn't make sense because each CPU can have
> policies changed independently. Instead introduce locks into to the
> cpudata structure and lock each CPU independently.
>
> The remaining "global" driver lock is used to ensure that only one
> entity can change driver modes at a given time.
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate.c | 27 +++++++++++++++++----------
> drivers/cpufreq/amd-pstate.h | 2 ++
> 2 files changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 77bc6418731ee..dd230ed3b9579 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -196,7 +196,6 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
> return -EINVAL;
> }
>
> -static DEFINE_MUTEX(amd_pstate_limits_lock);
> static DEFINE_MUTEX(amd_pstate_driver_lock);
>
> static u8 msr_get_epp(struct amd_cpudata *cpudata)
> @@ -283,6 +282,8 @@ static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
> u64 value, prev;
> int ret;
>
> + lockdep_assert_held(&cpudata->lock);
After making the perf_cached writes atomic, do we still need cpudata->lock?
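For context (illustrative), the read-modify-write of cppc_req_cached below is the kind of
sequence that is still not one atomic operation even with READ_ONCE()/WRITE_ONCE():

	/* each load/store is atomic, but two contexts interleaving this
	 * whole sequence can lose each other's update without a lock
	 */
	value = READ_ONCE(cpudata->cppc_req_cached);
	value &= ~AMD_CPPC_EPP_PERF_MASK;
	value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
	WRITE_ONCE(cpudata->cppc_req_cached, value);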
Regards,
Dhananjay
> +
> value = prev = READ_ONCE(cpudata->cppc_req_cached);
> value &= ~AMD_CPPC_EPP_PERF_MASK;
> value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
> @@ -315,6 +316,8 @@ static int shmem_set_epp(struct amd_cpudata *cpudata, u8 epp)
> int ret;
> struct cppc_perf_ctrls perf_ctrls;
>
> + lockdep_assert_held(&cpudata->lock);
> +
> if (epp == cpudata->epp_cached)
> return 0;
>
> @@ -335,6 +338,8 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
> struct amd_cpudata *cpudata = policy->driver_data;
> u8 epp;
>
> + guard(mutex)(&cpudata->lock);
> +
> if (!pref_index)
> epp = cpudata->epp_default;
> else
> @@ -750,7 +755,6 @@ static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
> pr_err("Boost mode is not supported by this processor or SBIOS\n");
> return -EOPNOTSUPP;
> }
> - guard(mutex)(&amd_pstate_driver_lock);
>
> ret = amd_pstate_cpu_boost_update(policy, state);
> refresh_frequency_limits(policy);
> @@ -973,6 +977,9 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>
> cpudata->cpu = policy->cpu;
>
> + mutex_init(&cpudata->lock);
> + guard(mutex)(&cpudata->lock);
> +
> ret = amd_pstate_init_perf(cpudata);
> if (ret)
> goto free_cpudata1;
> @@ -1179,8 +1186,6 @@ static ssize_t store_energy_performance_preference(
> if (ret < 0)
> return -EINVAL;
>
> - guard(mutex)(&amd_pstate_limits_lock);
> -
> ret = amd_pstate_set_energy_pref_index(policy, ret);
>
> return ret ? ret : count;
> @@ -1353,8 +1358,10 @@ int amd_pstate_update_status(const char *buf, size_t size)
> if (mode_idx < 0 || mode_idx >= AMD_PSTATE_MAX)
> return -EINVAL;
>
> - if (mode_state_machine[cppc_state][mode_idx])
> + if (mode_state_machine[cppc_state][mode_idx]) {
> + guard(mutex)(&amd_pstate_driver_lock);
> return mode_state_machine[cppc_state][mode_idx](mode_idx);
> + }
>
> return 0;
> }
> @@ -1375,7 +1382,6 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
> char *p = memchr(buf, '\n', count);
> int ret;
>
> - guard(mutex)(&amd_pstate_driver_lock);
> ret = amd_pstate_update_status(buf, p ? p - buf : count);
>
> return ret < 0 ? ret : count;
> @@ -1472,6 +1478,9 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>
> cpudata->cpu = policy->cpu;
>
> + mutex_init(&cpudata->lock);
> + guard(mutex)(&cpudata->lock);
> +
> ret = amd_pstate_init_perf(cpudata);
> if (ret)
> goto free_cpudata1;
> @@ -1558,6 +1567,8 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
> union perf_cached perf;
> u8 epp;
>
> + guard(mutex)(&cpudata->lock);
> +
> amd_pstate_update_min_max_limit(policy);
>
> if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> @@ -1646,8 +1657,6 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
> if (cpudata->suspended)
> return 0;
>
> - guard(mutex)(&amd_pstate_limits_lock);
> -
> if (trace_amd_pstate_epp_perf_enabled()) {
> trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
> AMD_CPPC_EPP_BALANCE_POWERSAVE,
> @@ -1684,8 +1693,6 @@ static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
> struct amd_cpudata *cpudata = policy->driver_data;
>
> if (cpudata->suspended) {
> - guard(mutex)(&amd_pstate_limits_lock);
> -
> /* enable amd pstate from suspend state*/
> amd_pstate_epp_reenable(policy);
>
> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
> index a140704b97430..6d776c3e5712a 100644
> --- a/drivers/cpufreq/amd-pstate.h
> +++ b/drivers/cpufreq/amd-pstate.h
> @@ -96,6 +96,8 @@ struct amd_cpudata {
> bool boost_supported;
> bool hw_prefcore;
>
> + struct mutex lock;
> +
> /* EPP feature related attributes*/
> u8 epp_cached;
> u32 policy;
* Re: [PATCH 05/14] cpufreq/amd-pstate: Drop `cppc_cap1_cached`
2025-02-06 21:56 ` [PATCH 05/14] cpufreq/amd-pstate: Drop `cppc_cap1_cached` Mario Limonciello
@ 2025-02-11 5:46 ` Dhananjay Ugwekar
0 siblings, 0 replies; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-11 5:46 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> The `cppc_cap1_cached` variable isn't used at all, there is no
> need to read it at initialization for each CPU.
Looks good to me,
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Thanks,
Dhananjay
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate.c | 5 -----
> drivers/cpufreq/amd-pstate.h | 2 --
> 2 files changed, 7 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index dd230ed3b9579..71636bd9884c8 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -1529,11 +1529,6 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> if (ret)
> return ret;
> WRITE_ONCE(cpudata->cppc_req_cached, value);
> -
> - ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1, &value);
> - if (ret)
> - return ret;
> - WRITE_ONCE(cpudata->cppc_cap1_cached, value);
> }
> ret = amd_pstate_set_epp(cpudata, cpudata->epp_default);
> if (ret)
> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
> index 6d776c3e5712a..7501d30db9953 100644
> --- a/drivers/cpufreq/amd-pstate.h
> +++ b/drivers/cpufreq/amd-pstate.h
> @@ -71,7 +71,6 @@ struct amd_aperf_mperf {
> * AMD P-State driver supports preferred core featue.
> * @epp_cached: Cached CPPC energy-performance preference value
> * @policy: Cpufreq policy value
> - * @cppc_cap1_cached Cached MSR_AMD_CPPC_CAP1 register value
> *
> * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
> * represents all the attributes and goals that AMD P-State requests at runtime.
> @@ -101,7 +100,6 @@ struct amd_cpudata {
> /* EPP feature related attributes*/
> u8 epp_cached;
> u32 policy;
> - u64 cppc_cap1_cached;
> bool suspended;
> u8 epp_default;
> };
* Re: [PATCH 06/14] cpufreq/amd-pstate-ut: Use _free macro to free put policy
2025-02-06 21:56 ` [PATCH 06/14] cpufreq/amd-pstate-ut: Use _free macro to free put policy Mario Limonciello
@ 2025-02-11 5:58 ` Dhananjay Ugwekar
0 siblings, 0 replies; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-11 5:58 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> Using a scoped cleanup macro simplifies cleanup code.
Looks good to me,
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Thanks,
Dhananjay
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate-ut.c | 33 ++++++++++++++-------------------
> 1 file changed, 14 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
> index d9ab98c6f56b1..adaa62fb2b04e 100644
> --- a/drivers/cpufreq/amd-pstate-ut.c
> +++ b/drivers/cpufreq/amd-pstate-ut.c
> @@ -26,6 +26,7 @@
> #include <linux/module.h>
> #include <linux/moduleparam.h>
> #include <linux/fs.h>
> +#include <linux/cleanup.h>
>
> #include <acpi/cppc_acpi.h>
>
> @@ -127,10 +128,11 @@ static void amd_pstate_ut_check_perf(u32 index)
> u32 highest_perf = 0, nominal_perf = 0, lowest_nonlinear_perf = 0, lowest_perf = 0;
> u64 cap1 = 0;
> struct cppc_perf_caps cppc_perf;
> - struct cpufreq_policy *policy = NULL;
> struct amd_cpudata *cpudata = NULL;
>
> for_each_possible_cpu(cpu) {
> + struct cpufreq_policy *policy __free(put_cpufreq_policy) = NULL;
> +
> policy = cpufreq_cpu_get(cpu);
> if (!policy)
> break;
> @@ -141,7 +143,7 @@ static void amd_pstate_ut_check_perf(u32 index)
> if (ret) {
> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
> pr_err("%s cppc_get_perf_caps ret=%d error!\n", __func__, ret);
> - goto skip_test;
> + return;
> }
>
> highest_perf = cppc_perf.highest_perf;
> @@ -153,7 +155,7 @@ static void amd_pstate_ut_check_perf(u32 index)
> if (ret) {
> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
> pr_err("%s read CPPC_CAP1 ret=%d error!\n", __func__, ret);
> - goto skip_test;
> + return;
> }
>
> highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
> @@ -166,7 +168,7 @@ static void amd_pstate_ut_check_perf(u32 index)
> !cpudata->hw_prefcore) {
> pr_err("%s cpu%d highest=%d %d highest perf doesn't match\n",
> __func__, cpu, highest_perf, cpudata->perf.highest_perf);
> - goto skip_test;
> + return;
> }
> if ((nominal_perf != READ_ONCE(cpudata->perf.nominal_perf)) ||
> (lowest_nonlinear_perf != READ_ONCE(cpudata->perf.lowest_nonlinear_perf)) ||
> @@ -176,7 +178,7 @@ static void amd_pstate_ut_check_perf(u32 index)
> __func__, cpu, nominal_perf, cpudata->perf.nominal_perf,
> lowest_nonlinear_perf, cpudata->perf.lowest_nonlinear_perf,
> lowest_perf, cpudata->perf.lowest_perf);
> - goto skip_test;
> + return;
> }
>
> if (!((highest_perf >= nominal_perf) &&
> @@ -187,15 +189,11 @@ static void amd_pstate_ut_check_perf(u32 index)
> pr_err("%s cpu%d highest=%d >= nominal=%d > lowest_nonlinear=%d > lowest=%d > 0, the formula is incorrect!\n",
> __func__, cpu, highest_perf, nominal_perf,
> lowest_nonlinear_perf, lowest_perf);
> - goto skip_test;
> + return;
> }
> - cpufreq_cpu_put(policy);
> }
>
> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_PASS;
> - return;
> -skip_test:
> - cpufreq_cpu_put(policy);
> }
>
> /*
> @@ -206,10 +204,11 @@ static void amd_pstate_ut_check_perf(u32 index)
> static void amd_pstate_ut_check_freq(u32 index)
> {
> int cpu = 0;
> - struct cpufreq_policy *policy = NULL;
> struct amd_cpudata *cpudata = NULL;
>
> for_each_possible_cpu(cpu) {
> + struct cpufreq_policy *policy __free(put_cpufreq_policy) = NULL;
> +
> policy = cpufreq_cpu_get(cpu);
> if (!policy)
> break;
> @@ -223,14 +222,14 @@ static void amd_pstate_ut_check_freq(u32 index)
> pr_err("%s cpu%d max=%d >= nominal=%d > lowest_nonlinear=%d > min=%d > 0, the formula is incorrect!\n",
> __func__, cpu, policy->cpuinfo.max_freq, cpudata->nominal_freq,
> cpudata->lowest_nonlinear_freq, policy->cpuinfo.min_freq);
> - goto skip_test;
> + return;
> }
>
> if (cpudata->lowest_nonlinear_freq != policy->min) {
> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
> pr_err("%s cpu%d cpudata_lowest_nonlinear_freq=%d policy_min=%d, they should be equal!\n",
> __func__, cpu, cpudata->lowest_nonlinear_freq, policy->min);
> - goto skip_test;
> + return;
> }
>
> if (cpudata->boost_supported) {
> @@ -242,20 +241,16 @@ static void amd_pstate_ut_check_freq(u32 index)
> pr_err("%s cpu%d policy_max=%d should be equal cpu_max=%d or cpu_nominal=%d !\n",
> __func__, cpu, policy->max, policy->cpuinfo.max_freq,
> cpudata->nominal_freq);
> - goto skip_test;
> + return;
> }
> } else {
> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
> pr_err("%s cpu%d must support boost!\n", __func__, cpu);
> - goto skip_test;
> + return;
> }
> - cpufreq_cpu_put(policy);
> }
>
> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_PASS;
> - return;
> -skip_test:
> - cpufreq_cpu_put(policy);
> }
>
> static int amd_pstate_set_mode(enum amd_pstate_mode mode)
* Re: [PATCH 07/14] cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks
2025-02-06 21:56 ` [PATCH 07/14] cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks Mario Limonciello
@ 2025-02-11 6:16 ` Dhananjay Ugwekar
2025-02-11 18:31 ` Mario Limonciello
0 siblings, 1 reply; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-11 6:16 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> Bitfield masks are easier to follow and less error prone.
Looks good to me, just one suggestion below. Apart from that,
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> arch/x86/include/asm/msr-index.h | 18 +++++++++---------
> arch/x86/kernel/acpi/cppc.c | 2 +-
> drivers/cpufreq/amd-pstate-ut.c | 8 ++++----
> drivers/cpufreq/amd-pstate.c | 16 ++++++----------
> 4 files changed, 20 insertions(+), 24 deletions(-)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 3eadc4d5de837..f77335ebae981 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -700,15 +700,15 @@
> #define MSR_AMD_CPPC_REQ 0xc00102b3
> #define MSR_AMD_CPPC_STATUS 0xc00102b4
>
> -#define AMD_CPPC_LOWEST_PERF(x) (((x) >> 0) & 0xff)
> -#define AMD_CPPC_LOWNONLIN_PERF(x) (((x) >> 8) & 0xff)
> -#define AMD_CPPC_NOMINAL_PERF(x) (((x) >> 16) & 0xff)
> -#define AMD_CPPC_HIGHEST_PERF(x) (((x) >> 24) & 0xff)
> -
> -#define AMD_CPPC_MAX_PERF(x) (((x) & 0xff) << 0)
> -#define AMD_CPPC_MIN_PERF(x) (((x) & 0xff) << 8)
> -#define AMD_CPPC_DES_PERF(x) (((x) & 0xff) << 16)
> -#define AMD_CPPC_ENERGY_PERF_PREF(x) (((x) & 0xff) << 24)
> +#define AMD_CPPC_LOWEST_PERF_MASK GENMASK(7, 0)
How about AMD_CPPC_"CAP"_LOWEST_PERF_MASK and
> +#define AMD_CPPC_LOWNONLIN_PERF_MASK GENMASK(15, 8)
> +#define AMD_CPPC_NOMINAL_PERF_MASK GENMASK(23, 16)
> +#define AMD_CPPC_HIGHEST_PERF_MASK GENMASK(31, 24)
> +
> +#define AMD_CPPC_MAX_PERF_MASK GENMASK(7, 0)
AMD_CPPC_"REQ"_MAX_PERF_MASK, just to indicate which register these fields
belong to? But we can keep it as is if you think that would be a mouthful;
I'll leave it up to you.
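Concretely, the naming I had in mind (illustrative only):

	/* MSR_AMD_CPPC_CAP1 fields */
	#define AMD_CPPC_CAP_LOWEST_PERF_MASK		GENMASK(7, 0)
	#define AMD_CPPC_CAP_LOWNONLIN_PERF_MASK	GENMASK(15, 8)
	#define AMD_CPPC_CAP_NOMINAL_PERF_MASK		GENMASK(23, 16)
	#define AMD_CPPC_CAP_HIGHEST_PERF_MASK		GENMASK(31, 24)

	/* MSR_AMD_CPPC_REQ fields */
	#define AMD_CPPC_REQ_MAX_PERF_MASK		GENMASK(7, 0)
	#define AMD_CPPC_REQ_MIN_PERF_MASK		GENMASK(15, 8)
	#define AMD_CPPC_REQ_DES_PERF_MASK		GENMASK(23, 16)
	#define AMD_CPPC_REQ_EPP_PERF_MASK		GENMASK(31, 24)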
Thanks,
Dhananjay
> +#define AMD_CPPC_MIN_PERF_MASK GENMASK(15, 8)
> +#define AMD_CPPC_DES_PERF_MASK GENMASK(23, 16)
> +#define AMD_CPPC_EPP_PERF_MASK GENMASK(31, 24)
>
> /* AMD Performance Counter Global Status and Control MSRs */
> #define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS 0xc0000300
> diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c
> index d745dd586303c..d68a4cb0168fa 100644
> --- a/arch/x86/kernel/acpi/cppc.c
> +++ b/arch/x86/kernel/acpi/cppc.c
> @@ -149,7 +149,7 @@ int amd_get_highest_perf(unsigned int cpu, u32 *highest_perf)
> if (ret)
> goto out;
>
> - val = AMD_CPPC_HIGHEST_PERF(val);
> + val = FIELD_GET(AMD_CPPC_HIGHEST_PERF_MASK, val);
> } else {
> ret = cppc_get_highest_perf(cpu, &val);
> if (ret)
> diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
> index adaa62fb2b04e..2595faa492bf1 100644
> --- a/drivers/cpufreq/amd-pstate-ut.c
> +++ b/drivers/cpufreq/amd-pstate-ut.c
> @@ -158,10 +158,10 @@ static void amd_pstate_ut_check_perf(u32 index)
> return;
> }
>
> - highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
> - nominal_perf = AMD_CPPC_NOMINAL_PERF(cap1);
> - lowest_nonlinear_perf = AMD_CPPC_LOWNONLIN_PERF(cap1);
> - lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
> + highest_perf = FIELD_GET(AMD_CPPC_HIGHEST_PERF_MASK, cap1);
> + nominal_perf = FIELD_GET(AMD_CPPC_NOMINAL_PERF_MASK, cap1);
> + lowest_nonlinear_perf = FIELD_GET(AMD_CPPC_LOWNONLIN_PERF_MASK, cap1);
> + lowest_perf = FIELD_GET(AMD_CPPC_LOWEST_PERF_MASK, cap1);
> }
>
> if (highest_perf != READ_ONCE(cpudata->perf.highest_perf) &&
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 71636bd9884c8..cd96443fc117f 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -89,11 +89,6 @@ static bool cppc_enabled;
> static bool amd_pstate_prefcore = true;
> static struct quirk_entry *quirks;
>
> -#define AMD_CPPC_MAX_PERF_MASK GENMASK(7, 0)
> -#define AMD_CPPC_MIN_PERF_MASK GENMASK(15, 8)
> -#define AMD_CPPC_DES_PERF_MASK GENMASK(23, 16)
> -#define AMD_CPPC_EPP_PERF_MASK GENMASK(31, 24)
> -
> /*
> * AMD Energy Preference Performance (EPP)
> * The EPP is used in the CCLK DPM controller to drive
> @@ -445,12 +440,13 @@ static int msr_init_perf(struct amd_cpudata *cpudata)
>
> perf.highest_perf = numerator;
> perf.max_limit_perf = numerator;
> - perf.min_limit_perf = AMD_CPPC_LOWEST_PERF(cap1);
> - perf.nominal_perf = AMD_CPPC_NOMINAL_PERF(cap1);
> - perf.lowest_nonlinear_perf = AMD_CPPC_LOWNONLIN_PERF(cap1);
> - perf.lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
> + perf.min_limit_perf = FIELD_GET(AMD_CPPC_LOWEST_PERF_MASK, cap1);
> + perf.nominal_perf = FIELD_GET(AMD_CPPC_NOMINAL_PERF_MASK, cap1);
> + perf.lowest_nonlinear_perf = FIELD_GET(AMD_CPPC_LOWNONLIN_PERF_MASK, cap1);
> + perf.lowest_perf = FIELD_GET(AMD_CPPC_LOWEST_PERF_MASK, cap1);
> WRITE_ONCE(cpudata->perf, perf);
> - WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1));
> + WRITE_ONCE(cpudata->prefcore_ranking, FIELD_GET(AMD_CPPC_HIGHEST_PERF_MASK, cap1));
> +
> return 0;
> }
>
* Re: [PATCH 08/14] cpufreq/amd-pstate: Cache CPPC request in shared mem case too
2025-02-06 21:56 ` [PATCH 08/14] cpufreq/amd-pstate: Cache CPPC request in shared mem case too Mario Limonciello
@ 2025-02-11 9:18 ` Dhananjay Ugwekar
0 siblings, 0 replies; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-11 9:18 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> In order to prevent a potential write for shmem_update_perf()
> cache the request into the cppc_req_cached variable normally only
> used for the MSR case.
Looks good to me,
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Thanks,
Dhananjay
>
> This adds symmetry into the code and potentially avoids extra writes.
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate.c | 22 +++++++++++++++++++++-
> 1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index cd96443fc117f..2aa3d5be2efe5 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -502,6 +502,8 @@ static int shmem_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
> u8 des_perf, u8 max_perf, u8 epp, bool fast_switch)
> {
> struct cppc_perf_ctrls perf_ctrls;
> + u64 value, prev;
> + int ret;
>
> if (cppc_state == AMD_PSTATE_ACTIVE) {
> int ret = shmem_set_epp(cpudata, epp);
> @@ -510,11 +512,29 @@ static int shmem_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
> return ret;
> }
>
> + value = prev = READ_ONCE(cpudata->cppc_req_cached);
> +
> + value &= ~(AMD_CPPC_MAX_PERF_MASK | AMD_CPPC_MIN_PERF_MASK |
> + AMD_CPPC_DES_PERF_MASK | AMD_CPPC_EPP_PERF_MASK);
> + value |= FIELD_PREP(AMD_CPPC_MAX_PERF_MASK, max_perf);
> + value |= FIELD_PREP(AMD_CPPC_DES_PERF_MASK, des_perf);
> + value |= FIELD_PREP(AMD_CPPC_MIN_PERF_MASK, min_perf);
> + value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
> +
> + if (value == prev)
> + return 0;
> +
> perf_ctrls.max_perf = max_perf;
> perf_ctrls.min_perf = min_perf;
> perf_ctrls.desired_perf = des_perf;
>
> - return cppc_set_perf(cpudata->cpu, &perf_ctrls);
> + ret = cppc_set_perf(cpudata->cpu, &perf_ctrls);
> + if (ret)
> + return ret;
> +
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> + return 0;
> }
>
> static inline bool amd_pstate_sample(struct amd_cpudata *cpudata)
* Re: [PATCH 10/14] cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes
2025-02-06 21:56 ` [PATCH 10/14] cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes Mario Limonciello
@ 2025-02-11 13:01 ` Dhananjay Ugwekar
0 siblings, 0 replies; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-11 13:01 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> On EPP only writes update the cached variable so that the min/max
> performance controls don't need to be updated again.
Looks good to me,
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Thanks,
Dhananjay
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index e66ccfce5893f..754f2d606b371 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -338,6 +338,7 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> struct cppc_perf_ctrls perf_ctrls;
> + u64 value;
> int ret;
>
> lockdep_assert_held(&cpudata->lock);
> @@ -366,6 +367,11 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
> }
> WRITE_ONCE(cpudata->epp_cached, epp);
>
> + value = READ_ONCE(cpudata->cppc_req_cached);
> + value &= ~AMD_CPPC_EPP_PERF_MASK;
> + value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
> + WRITE_ONCE(cpudata->cppc_req_cached, value);
> +
> return ret;
> }
>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 11/14] cpufreq/amd-pstate: Drop debug statements for policy setting
2025-02-06 21:56 ` [PATCH 11/14] cpufreq/amd-pstate: Drop debug statements for policy setting Mario Limonciello
@ 2025-02-11 13:03 ` Dhananjay Ugwekar
0 siblings, 0 replies; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-11 13:03 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> Trace events now exist for all amd-pstate modes that output information
> right before programming the hardware.
>
> This makes the existing debug statements nothing more than unnecessary
> overhead. Drop them.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Thanks,
Dhananjay
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 754f2d606b371..689de385d06da 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -673,7 +673,6 @@ static int amd_pstate_verify(struct cpufreq_policy_data *policy_data)
> }
>
> cpufreq_verify_within_cpu_limits(policy_data);
> - pr_debug("policy_max =%d, policy_min=%d\n", policy_data->max, policy_data->min);
>
> return 0;
> }
> @@ -1652,9 +1651,6 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> if (!policy->cpuinfo.max_freq)
> return -ENODEV;
>
> - pr_debug("set_policy: cpuinfo.max %u policy->max %u\n",
> - policy->cpuinfo.max_freq, policy->max);
> -
> cpudata->policy = policy->policy;
>
> ret = amd_pstate_epp_update_limit(policy);
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 12/14] cpufreq/amd-pstate: Cache a pointer to policy in cpudata
2025-02-06 21:56 ` [PATCH 12/14] cpufreq/amd-pstate: Cache a pointer to policy in cpudata Mario Limonciello
@ 2025-02-11 13:13 ` Dhananjay Ugwekar
2025-02-11 19:17 ` Mario Limonciello
0 siblings, 1 reply; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-11 13:13 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> In order to access the policy from a notification block it will
> need to be stored in cpudata.
This might break the cpufreq_policy ref counting, right? We'd be caching the
pointer and using it independently of the ref-counting framework.
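For context, the ref-counted access pattern the framework expects looks
roughly like this (an illustrative sketch, not driver code; the function
name is hypothetical, while cpufreq_cpu_get()/cpufreq_cpu_put() are the
standard helpers):

static void example_use_policy(unsigned int cpu)
{
	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);	/* takes a reference */

	if (!policy)
		return;

	/* ... safe to use policy here; it cannot be freed under us ... */

	cpufreq_cpu_put(policy);	/* drop the reference we took */
}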
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate.c | 13 +++++++------
> drivers/cpufreq/amd-pstate.h | 3 ++-
> 2 files changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 689de385d06da..5945b6c7f7e56 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -388,7 +388,7 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
> else
> epp = epp_values[pref_index];
>
> - if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> + if (epp > 0 && policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
> pr_debug("EPP cannot be set under performance policy\n");
> return -EBUSY;
> }
> @@ -689,7 +689,7 @@ static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
> perf.max_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->max);
> perf.min_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->min);
>
> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
> perf.min_limit_perf = min(perf.nominal_perf, perf.max_limit_perf);
>
> WRITE_ONCE(cpudata->perf, perf);
> @@ -1042,6 +1042,7 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> return -ENOMEM;
>
> cpudata->cpu = policy->cpu;
> + cpudata->policy = policy;
>
> mutex_init(&cpudata->lock);
> guard(mutex)(&cpudata->lock);
> @@ -1224,9 +1225,8 @@ static ssize_t show_energy_performance_available_preferences(
> {
> int i = 0;
> int offset = 0;
> - struct amd_cpudata *cpudata = policy->driver_data;
>
> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
> return sysfs_emit_at(buf, offset, "%s\n",
> energy_perf_strings[EPP_INDEX_PERFORMANCE]);
>
> @@ -1543,6 +1543,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> return -ENOMEM;
>
> cpudata->cpu = policy->cpu;
> + cpudata->policy = policy;
>
> mutex_init(&cpudata->lock);
> guard(mutex)(&cpudata->lock);
> @@ -1632,7 +1633,7 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
>
> amd_pstate_update_min_max_limit(policy);
>
> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
> + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
> epp = 0;
> else
> epp = READ_ONCE(cpudata->epp_cached);
> @@ -1651,7 +1652,7 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> if (!policy->cpuinfo.max_freq)
> return -ENODEV;
>
> - cpudata->policy = policy->policy;
> + cpudata->policy = policy;
>
> ret = amd_pstate_epp_update_limit(policy);
> if (ret)
> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
> index 7501d30db9953..16ce631a6c3d5 100644
> --- a/drivers/cpufreq/amd-pstate.h
> +++ b/drivers/cpufreq/amd-pstate.h
> @@ -97,9 +97,10 @@ struct amd_cpudata {
>
> struct mutex lock;
>
> + struct cpufreq_policy *policy;
> +
> /* EPP feature related attributes*/
> u8 epp_cached;
> - u32 policy;
> bool suspended;
> u8 epp_default;
> };
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 14/14] cpufreq/amd-pstate: Stop caching EPP
2025-02-06 21:56 ` [PATCH 14/14] cpufreq/amd-pstate: Stop caching EPP Mario Limonciello
@ 2025-02-11 13:27 ` Dhananjay Ugwekar
0 siblings, 0 replies; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-11 13:27 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> EPP values are cached in the cpudata structure per CPU. This is needless,
> though, because they are also cached in the CPPC request variable.
>
> Drop the separate cache for EPP values and always reference the CPPC
> request variable when needed.
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Thanks,
Dhananjay
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate.c | 30 ++++++++++++++++--------------
> drivers/cpufreq/amd-pstate.h | 1 -
> 2 files changed, 16 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 697fa1b80cf24..38e5e925a7aed 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -268,8 +268,6 @@ static int msr_update_perf(struct cpufreq_policy *policy, u8 min_perf,
> }
>
> WRITE_ONCE(cpudata->cppc_req_cached, value);
> - if (epp != cpudata->epp_cached)
> - WRITE_ONCE(cpudata->epp_cached, epp);
>
> return 0;
> }
> @@ -320,7 +318,6 @@ static int msr_set_epp(struct cpufreq_policy *policy, u8 epp)
> }
>
> /* update both so that msr_update_perf() can effectively check */
> - WRITE_ONCE(cpudata->epp_cached, epp);
> WRITE_ONCE(cpudata->cppc_req_cached, value);
>
> return ret;
> @@ -337,11 +334,14 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> struct cppc_perf_ctrls perf_ctrls;
> + u8 epp_cached;
> u64 value;
> int ret;
>
> lockdep_assert_held(&cpudata->lock);
>
> + epp_cached = FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached);
> +
> if (trace_amd_pstate_epp_perf_enabled()) {
> union perf_cached perf = cpudata->perf;
>
> @@ -352,10 +352,10 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
> FIELD_GET(AMD_CPPC_MAX_PERF_MASK,
> cpudata->cppc_req_cached),
> policy->boost_enabled,
> - epp != cpudata->epp_cached);
> + epp != epp_cached);
> }
>
> - if (epp == cpudata->epp_cached)
> + if (epp == epp_cached)
> return 0;
>
> perf_ctrls.energy_perf = epp;
> @@ -364,7 +364,6 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
> pr_debug("failed to set energy perf value (%d)\n", ret);
> return ret;
> }
> - WRITE_ONCE(cpudata->epp_cached, epp);
>
> value = READ_ONCE(cpudata->cppc_req_cached);
> value &= ~AMD_CPPC_EPP_PERF_MASK;
> @@ -1218,9 +1217,11 @@ static ssize_t show_energy_performance_preference(
> struct cpufreq_policy *policy, char *buf)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> - u8 preference;
> + u8 preference, epp;
> +
> + epp = FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached);
>
> - switch (cpudata->epp_cached) {
> + switch (epp) {
> case AMD_CPPC_EPP_PERFORMANCE:
> preference = EPP_INDEX_PERFORMANCE;
> break;
> @@ -1588,7 +1589,7 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
> if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
> epp = 0;
> else
> - epp = READ_ONCE(cpudata->epp_cached);
> + epp = FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached);
>
> perf = READ_ONCE(cpudata->perf);
>
> @@ -1624,23 +1625,24 @@ static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
> struct amd_cpudata *cpudata = policy->driver_data;
> union perf_cached perf = cpudata->perf;
> int ret;
> + u8 epp;
> +
> + guard(mutex)(&cpudata->lock);
> +
> + epp = FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached);
>
> pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
>
> ret = amd_pstate_cppc_enable(policy);
> if (ret)
> return ret;
> -
> - guard(mutex)(&cpudata->lock);
> -
> - ret = amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
> + ret = amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, epp, false);
> if (ret)
> return ret;
>
> cpudata->suspended = false;
>
> return 0;
> -
> }
>
> static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
> index 16ce631a6c3d5..c95b4081a3ff6 100644
> --- a/drivers/cpufreq/amd-pstate.h
> +++ b/drivers/cpufreq/amd-pstate.h
> @@ -100,7 +100,6 @@ struct amd_cpudata {
> struct cpufreq_policy *policy;
>
> /* EPP feature related attributes*/
> - u8 epp_cached;
> bool suspended;
> u8 epp_default;
> };
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 07/14] cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks
2025-02-11 6:16 ` Dhananjay Ugwekar
@ 2025-02-11 18:31 ` Mario Limonciello
0 siblings, 0 replies; 41+ messages in thread
From: Mario Limonciello @ 2025-02-11 18:31 UTC (permalink / raw)
To: Dhananjay Ugwekar, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/11/2025 00:16, Dhananjay Ugwekar wrote:
> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> Bitfield masks are easier to follow and less error prone.
>
> Looks good to me, just one suggestion below, apart from that,
>
> Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
>
>>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>> arch/x86/include/asm/msr-index.h | 18 +++++++++---------
>> arch/x86/kernel/acpi/cppc.c | 2 +-
>> drivers/cpufreq/amd-pstate-ut.c | 8 ++++----
>> drivers/cpufreq/amd-pstate.c | 16 ++++++----------
>> 4 files changed, 20 insertions(+), 24 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index 3eadc4d5de837..f77335ebae981 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -700,15 +700,15 @@
>> #define MSR_AMD_CPPC_REQ 0xc00102b3
>> #define MSR_AMD_CPPC_STATUS 0xc00102b4
>>
>> -#define AMD_CPPC_LOWEST_PERF(x) (((x) >> 0) & 0xff)
>> -#define AMD_CPPC_LOWNONLIN_PERF(x) (((x) >> 8) & 0xff)
>> -#define AMD_CPPC_NOMINAL_PERF(x) (((x) >> 16) & 0xff)
>> -#define AMD_CPPC_HIGHEST_PERF(x) (((x) >> 24) & 0xff)
>> -
>> -#define AMD_CPPC_MAX_PERF(x) (((x) & 0xff) << 0)
>> -#define AMD_CPPC_MIN_PERF(x) (((x) & 0xff) << 8)
>> -#define AMD_CPPC_DES_PERF(x) (((x) & 0xff) << 16)
>> -#define AMD_CPPC_ENERGY_PERF_PREF(x) (((x) & 0xff) << 24)
>> +#define AMD_CPPC_LOWEST_PERF_MASK GENMASK(7, 0)
>
> How about AMD_CPPC_"CAP"_LOWEST_PERF_MASK and
>
>> +#define AMD_CPPC_LOWNONLIN_PERF_MASK GENMASK(15, 8)
>> +#define AMD_CPPC_NOMINAL_PERF_MASK GENMASK(23, 16)
>> +#define AMD_CPPC_HIGHEST_PERF_MASK GENMASK(31, 24)
>> +
>> +#define AMD_CPPC_MAX_PERF_MASK GENMASK(7, 0)
>
> AMD_CPPC_"REQ"_MAX_PERF_MASK, just to indicate which register these
> fields belong to? We can keep it as is if you think that would be a
> mouthful; I'll leave it up to you.
I'll split the difference and include a comment around them to indicate
what they're for.
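To illustrate why a single mask definition is less error prone (a hedged
sketch using a hypothetical EXAMPLE_EPP_MASK, not the actual MSR layout):
FIELD_PREP() and FIELD_GET() derive the shift from the mask itself, so the
encode and decode sides cannot drift apart the way paired shift-in/shift-out
macros can.

#include <linux/bitfield.h>
#include <linux/bits.h>

#define EXAMPLE_EPP_MASK	GENMASK(31, 24)	/* hypothetical field */

static u64 example_set_epp(u64 req, u8 epp)
{
	req &= ~EXAMPLE_EPP_MASK;			/* clear the field */
	req |= FIELD_PREP(EXAMPLE_EPP_MASK, epp);	/* shift derived from mask */
	return req;
}

static u8 example_get_epp(u64 req)
{
	return FIELD_GET(EXAMPLE_EPP_MASK, req);	/* same mask, no separate shift */
}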
>
> Thanks,
> Dhananjay
>
>> +#define AMD_CPPC_MIN_PERF_MASK GENMASK(15, 8)
>> +#define AMD_CPPC_DES_PERF_MASK GENMASK(23, 16)
>> +#define AMD_CPPC_EPP_PERF_MASK GENMASK(31, 24)
>>
>> /* AMD Performance Counter Global Status and Control MSRs */
>> #define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS 0xc0000300
>> diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c
>> index d745dd586303c..d68a4cb0168fa 100644
>> --- a/arch/x86/kernel/acpi/cppc.c
>> +++ b/arch/x86/kernel/acpi/cppc.c
>> @@ -149,7 +149,7 @@ int amd_get_highest_perf(unsigned int cpu, u32 *highest_perf)
>> if (ret)
>> goto out;
>>
>> - val = AMD_CPPC_HIGHEST_PERF(val);
>> + val = FIELD_GET(AMD_CPPC_HIGHEST_PERF_MASK, val);
>> } else {
>> ret = cppc_get_highest_perf(cpu, &val);
>> if (ret)
>> diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
>> index adaa62fb2b04e..2595faa492bf1 100644
>> --- a/drivers/cpufreq/amd-pstate-ut.c
>> +++ b/drivers/cpufreq/amd-pstate-ut.c
>> @@ -158,10 +158,10 @@ static void amd_pstate_ut_check_perf(u32 index)
>> return;
>> }
>>
>> - highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
>> - nominal_perf = AMD_CPPC_NOMINAL_PERF(cap1);
>> - lowest_nonlinear_perf = AMD_CPPC_LOWNONLIN_PERF(cap1);
>> - lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
>> + highest_perf = FIELD_GET(AMD_CPPC_HIGHEST_PERF_MASK, cap1);
>> + nominal_perf = FIELD_GET(AMD_CPPC_NOMINAL_PERF_MASK, cap1);
>> + lowest_nonlinear_perf = FIELD_GET(AMD_CPPC_LOWNONLIN_PERF_MASK, cap1);
>> + lowest_perf = FIELD_GET(AMD_CPPC_LOWEST_PERF_MASK, cap1);
>> }
>>
>> if (highest_perf != READ_ONCE(cpudata->perf.highest_perf) &&
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 71636bd9884c8..cd96443fc117f 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -89,11 +89,6 @@ static bool cppc_enabled;
>> static bool amd_pstate_prefcore = true;
>> static struct quirk_entry *quirks;
>>
>> -#define AMD_CPPC_MAX_PERF_MASK GENMASK(7, 0)
>> -#define AMD_CPPC_MIN_PERF_MASK GENMASK(15, 8)
>> -#define AMD_CPPC_DES_PERF_MASK GENMASK(23, 16)
>> -#define AMD_CPPC_EPP_PERF_MASK GENMASK(31, 24)
>> -
>> /*
>> * AMD Energy Preference Performance (EPP)
>> * The EPP is used in the CCLK DPM controller to drive
>> @@ -445,12 +440,13 @@ static int msr_init_perf(struct amd_cpudata *cpudata)
>>
>> perf.highest_perf = numerator;
>> perf.max_limit_perf = numerator;
>> - perf.min_limit_perf = AMD_CPPC_LOWEST_PERF(cap1);
>> - perf.nominal_perf = AMD_CPPC_NOMINAL_PERF(cap1);
>> - perf.lowest_nonlinear_perf = AMD_CPPC_LOWNONLIN_PERF(cap1);
>> - perf.lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
>> + perf.min_limit_perf = FIELD_GET(AMD_CPPC_LOWEST_PERF_MASK, cap1);
>> + perf.nominal_perf = FIELD_GET(AMD_CPPC_NOMINAL_PERF_MASK, cap1);
>> + perf.lowest_nonlinear_perf = FIELD_GET(AMD_CPPC_LOWNONLIN_PERF_MASK, cap1);
>> + perf.lowest_perf = FIELD_GET(AMD_CPPC_LOWEST_PERF_MASK, cap1);
>> WRITE_ONCE(cpudata->perf, perf);
>> - WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1));
>> + WRITE_ONCE(cpudata->prefcore_ranking, FIELD_GET(AMD_CPPC_HIGHEST_PERF_MASK, cap1));
>> +
>> return 0;
>> }
>>
>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 12/14] cpufreq/amd-pstate: Cache a pointer to policy in cpudata
2025-02-11 13:13 ` Dhananjay Ugwekar
@ 2025-02-11 19:17 ` Mario Limonciello
2025-02-12 3:52 ` Dhananjay Ugwekar
0 siblings, 1 reply; 41+ messages in thread
From: Mario Limonciello @ 2025-02-11 19:17 UTC (permalink / raw)
To: Dhananjay Ugwekar, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/11/2025 07:13, Dhananjay Ugwekar wrote:
> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> In order to access the policy from a notification block it will
>> need to be stored in cpudata.
>
> This might break the cpufreq_policy ref counting right?, if we cache the pointer
> and use it independent of the ref counting framework.
Would it be reasonable to bump the ref count when we take the pointer?
I'm not sure if this will work properly.
>
>>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>> drivers/cpufreq/amd-pstate.c | 13 +++++++------
>> drivers/cpufreq/amd-pstate.h | 3 ++-
>> 2 files changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 689de385d06da..5945b6c7f7e56 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -388,7 +388,7 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
>> else
>> epp = epp_values[pref_index];
>>
>> - if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
>> + if (epp > 0 && policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
>> pr_debug("EPP cannot be set under performance policy\n");
>> return -EBUSY;
>> }
>> @@ -689,7 +689,7 @@ static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
>> perf.max_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->max);
>> perf.min_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->min);
>>
>> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>> + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
>> perf.min_limit_perf = min(perf.nominal_perf, perf.max_limit_perf);
>>
>> WRITE_ONCE(cpudata->perf, perf);
>> @@ -1042,6 +1042,7 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>> return -ENOMEM;
>>
>> cpudata->cpu = policy->cpu;
>> + cpudata->policy = policy;
>>
>> mutex_init(&cpudata->lock);
>> guard(mutex)(&cpudata->lock);
>> @@ -1224,9 +1225,8 @@ static ssize_t show_energy_performance_available_preferences(
>> {
>> int i = 0;
>> int offset = 0;
>> - struct amd_cpudata *cpudata = policy->driver_data;
>>
>> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>> + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
>> return sysfs_emit_at(buf, offset, "%s\n",
>> energy_perf_strings[EPP_INDEX_PERFORMANCE]);
>>
>> @@ -1543,6 +1543,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>> return -ENOMEM;
>>
>> cpudata->cpu = policy->cpu;
>> + cpudata->policy = policy;
>>
>> mutex_init(&cpudata->lock);
>> guard(mutex)(&cpudata->lock);
>> @@ -1632,7 +1633,7 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
>>
>> amd_pstate_update_min_max_limit(policy);
>>
>> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>> + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
>> epp = 0;
>> else
>> epp = READ_ONCE(cpudata->epp_cached);
>> @@ -1651,7 +1652,7 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
>> if (!policy->cpuinfo.max_freq)
>> return -ENODEV;
>>
>> - cpudata->policy = policy->policy;
>> + cpudata->policy = policy;
>>
>> ret = amd_pstate_epp_update_limit(policy);
>> if (ret)
>> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
>> index 7501d30db9953..16ce631a6c3d5 100644
>> --- a/drivers/cpufreq/amd-pstate.h
>> +++ b/drivers/cpufreq/amd-pstate.h
>> @@ -97,9 +97,10 @@ struct amd_cpudata {
>>
>> struct mutex lock;
>>
>> + struct cpufreq_policy *policy;
>> +
>> /* EPP feature related attributes*/
>> u8 epp_cached;
>> - u32 policy;
>> bool suspended;
>> u8 epp_default;
>> };
>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 04/14] cpufreq/amd-pstate: Overhaul locking
2025-02-11 5:02 ` Dhananjay Ugwekar
@ 2025-02-11 21:54 ` Mario Limonciello
2025-02-12 5:15 ` Dhananjay Ugwekar
0 siblings, 1 reply; 41+ messages in thread
From: Mario Limonciello @ 2025-02-11 21:54 UTC (permalink / raw)
To: Dhananjay Ugwekar, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/10/2025 23:02, Dhananjay Ugwekar wrote:
> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> amd_pstate_cpu_boost_update() and refresh_frequency_limits() both
>> update the policy state and have nothing to do with the amd-pstate
>> driver itself.
>>
>> A global "limits" lock doesn't make sense because each CPU can have
>> policies changed independently. Instead introduce locks into to the
>> cpudata structure and lock each CPU independently.
>>
>> The remaining "global" driver lock is used to ensure that only one
>> entity can change driver modes at a given time.
>>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>> drivers/cpufreq/amd-pstate.c | 27 +++++++++++++++++----------
>> drivers/cpufreq/amd-pstate.h | 2 ++
>> 2 files changed, 19 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 77bc6418731ee..dd230ed3b9579 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -196,7 +196,6 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
>> return -EINVAL;
>> }
>>
>> -static DEFINE_MUTEX(amd_pstate_limits_lock);
>> static DEFINE_MUTEX(amd_pstate_driver_lock);
>>
>> static u8 msr_get_epp(struct amd_cpudata *cpudata)
>> @@ -283,6 +282,8 @@ static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
>> u64 value, prev;
>> int ret;
>>
>> + lockdep_assert_held(&cpudata->lock);
>
> After making the perf_cached variable writes atomic, do we still need a cpudata->lock?
My concern was specifically that userspace could interact with multiple
sysfs files that influence the atomic perf variable (and the HW) at the
same time, so you would not have deterministic behavior if they raced.
But if you take the mutex on all the paths where this could happen,
they're handled FIFO.
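Roughly like this (a minimal sketch of the serialization, not the actual
store handlers; the function name is made up):

static ssize_t example_store(struct cpufreq_policy *policy,
			     const char *buf, size_t count)
{
	struct amd_cpudata *cpudata = policy->driver_data;

	/* serialize concurrent sysfs writers; released automatically on return */
	guard(mutex)(&cpudata->lock);

	/* ... parse buf, update cpudata->perf and the hardware ... */

	return count;
}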
>
> Regards,
> Dhananjay
>
>> +
>> value = prev = READ_ONCE(cpudata->cppc_req_cached);
>> value &= ~AMD_CPPC_EPP_PERF_MASK;
>> value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
>> @@ -315,6 +316,8 @@ static int shmem_set_epp(struct amd_cpudata *cpudata, u8 epp)
>> int ret;
>> struct cppc_perf_ctrls perf_ctrls;
>>
>> + lockdep_assert_held(&cpudata->lock);
>> +
>> if (epp == cpudata->epp_cached)
>> return 0;
>>
>> @@ -335,6 +338,8 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
>> struct amd_cpudata *cpudata = policy->driver_data;
>> u8 epp;
>>
>> + guard(mutex)(&cpudata->lock);
>> +
>> if (!pref_index)
>> epp = cpudata->epp_default;
>> else
>> @@ -750,7 +755,6 @@ static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
>> pr_err("Boost mode is not supported by this processor or SBIOS\n");
>> return -EOPNOTSUPP;
>> }
>> - guard(mutex)(&amd_pstate_driver_lock);
>>
>> ret = amd_pstate_cpu_boost_update(policy, state);
>> refresh_frequency_limits(policy);
>> @@ -973,6 +977,9 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>
>> cpudata->cpu = policy->cpu;
>>
>> + mutex_init(&cpudata->lock);
>> + guard(mutex)(&cpudata->lock);
>> +
>> ret = amd_pstate_init_perf(cpudata);
>> if (ret)
>> goto free_cpudata1;
>> @@ -1179,8 +1186,6 @@ static ssize_t store_energy_performance_preference(
>> if (ret < 0)
>> return -EINVAL;
>>
>> - guard(mutex)(&amd_pstate_limits_lock);
>> -
>> ret = amd_pstate_set_energy_pref_index(policy, ret);
>>
>> return ret ? ret : count;
>> @@ -1353,8 +1358,10 @@ int amd_pstate_update_status(const char *buf, size_t size)
>> if (mode_idx < 0 || mode_idx >= AMD_PSTATE_MAX)
>> return -EINVAL;
>>
>> - if (mode_state_machine[cppc_state][mode_idx])
>> + if (mode_state_machine[cppc_state][mode_idx]) {
>> + guard(mutex)(&amd_pstate_driver_lock);
>> return mode_state_machine[cppc_state][mode_idx](mode_idx);
>> + }
>>
>> return 0;
>> }
>> @@ -1375,7 +1382,6 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
>> char *p = memchr(buf, '\n', count);
>> int ret;
>>
>> - guard(mutex)(&amd_pstate_driver_lock);
>> ret = amd_pstate_update_status(buf, p ? p - buf : count);
>>
>> return ret < 0 ? ret : count;
>> @@ -1472,6 +1478,9 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>>
>> cpudata->cpu = policy->cpu;
>>
>> + mutex_init(&cpudata->lock);
>> + guard(mutex)(&cpudata->lock);
>> +
>> ret = amd_pstate_init_perf(cpudata);
>> if (ret)
>> goto free_cpudata1;
>> @@ -1558,6 +1567,8 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
>> union perf_cached perf;
>> u8 epp;
>>
>> + guard(mutex)(&cpudata->lock);
>> +
>> amd_pstate_update_min_max_limit(policy);
>>
>> if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>> @@ -1646,8 +1657,6 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
>> if (cpudata->suspended)
>> return 0;
>>
>> - guard(mutex)(&amd_pstate_limits_lock);
>> -
>> if (trace_amd_pstate_epp_perf_enabled()) {
>> trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
>> AMD_CPPC_EPP_BALANCE_POWERSAVE,
>> @@ -1684,8 +1693,6 @@ static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
>> struct amd_cpudata *cpudata = policy->driver_data;
>>
>> if (cpudata->suspended) {
>> - guard(mutex)(&amd_pstate_limits_lock);
>> -
>> /* enable amd pstate from suspend state*/
>> amd_pstate_epp_reenable(policy);
>>
>> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
>> index a140704b97430..6d776c3e5712a 100644
>> --- a/drivers/cpufreq/amd-pstate.h
>> +++ b/drivers/cpufreq/amd-pstate.h
>> @@ -96,6 +96,8 @@ struct amd_cpudata {
>> bool boost_supported;
>> bool hw_prefcore;
>>
>> + struct mutex lock;
>> +
>> /* EPP feature related attributes*/
>> u8 epp_cached;
>> u32 policy;
>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 03/14] cpufreq/amd-pstate: Move perf values into a union
2025-02-10 13:38 ` Dhananjay Ugwekar
@ 2025-02-11 22:14 ` Mario Limonciello
2025-02-12 6:31 ` Dhananjay Ugwekar
0 siblings, 1 reply; 41+ messages in thread
From: Mario Limonciello @ 2025-02-11 22:14 UTC (permalink / raw)
To: Dhananjay Ugwekar, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/10/2025 07:38, Dhananjay Ugwekar wrote:
> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> By storing perf values in a union all the writes and reads can
>> be done atomically, removing the need for some concurrency protections.
>>
>> While making this change, also drop the cached frequency values,
>> using inline helpers to calculate them on demand from perf value.
>>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>> drivers/cpufreq/amd-pstate-ut.c | 17 +--
>> drivers/cpufreq/amd-pstate.c | 212 +++++++++++++++++++-------------
>> drivers/cpufreq/amd-pstate.h | 48 +++++---
>> 3 files changed, 163 insertions(+), 114 deletions(-)
>>
>> diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
>> index 445278cf40b61..d9ab98c6f56b1 100644
>> --- a/drivers/cpufreq/amd-pstate-ut.c
>> +++ b/drivers/cpufreq/amd-pstate-ut.c
>> @@ -162,19 +162,20 @@ static void amd_pstate_ut_check_perf(u32 index)
>> lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
>> }
>>
>> - if (highest_perf != READ_ONCE(cpudata->highest_perf) && !cpudata->hw_prefcore) {
>> + if (highest_perf != READ_ONCE(cpudata->perf.highest_perf) &&
>> + !cpudata->hw_prefcore) {
>> pr_err("%s cpu%d highest=%d %d highest perf doesn't match\n",
>> - __func__, cpu, highest_perf, cpudata->highest_perf);
>> + __func__, cpu, highest_perf, cpudata->perf.highest_perf);
>> goto skip_test;
>> }
>> - if ((nominal_perf != READ_ONCE(cpudata->nominal_perf)) ||
>> - (lowest_nonlinear_perf != READ_ONCE(cpudata->lowest_nonlinear_perf)) ||
>> - (lowest_perf != READ_ONCE(cpudata->lowest_perf))) {
>> + if ((nominal_perf != READ_ONCE(cpudata->perf.nominal_perf)) ||
>> + (lowest_nonlinear_perf != READ_ONCE(cpudata->perf.lowest_nonlinear_perf)) ||
>> + (lowest_perf != READ_ONCE(cpudata->perf.lowest_perf))) {
>
> How about making a local copy of cpudata->perf and using that, instead of
> dereferencing the cpudata pointer multiple times? Something like:
Sure
>
> union perf_cached cur_perf = READ_ONCE(cpudata->perf);
>
> if ((nominal_perf != cur_perf.nominal_perf) ||
>     (lowest_nonlinear_perf != cur_perf.lowest_nonlinear_perf) ||
>     (lowest_perf != cur_perf.lowest_perf)) {
>
>> amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
>> pr_err("%s cpu%d nominal=%d %d lowest_nonlinear=%d %d lowest=%d %d, they should be equal!\n",
>> - __func__, cpu, nominal_perf, cpudata->nominal_perf,
>> - lowest_nonlinear_perf, cpudata->lowest_nonlinear_perf,
>> - lowest_perf, cpudata->lowest_perf);
>> + __func__, cpu, nominal_perf, cpudata->perf.nominal_perf,
>> + lowest_nonlinear_perf, cpudata->perf.lowest_nonlinear_perf,
>> + lowest_perf, cpudata->perf.lowest_perf);
>> goto skip_test;
>> }
>>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 668377f55b630..77bc6418731ee 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -142,18 +142,17 @@ static struct quirk_entry quirk_amd_7k62 = {
>> .lowest_freq = 550,
>> };
>>
>> -static inline u8 freq_to_perf(struct amd_cpudata *cpudata, unsigned int freq_val)
>> +static inline u8 freq_to_perf(union perf_cached perf, u32 nominal_freq, unsigned int freq_val)
>> {
>> - u8 perf_val = DIV_ROUND_UP_ULL((u64)freq_val * cpudata->nominal_perf,
>> - cpudata->nominal_freq);
>> + u8 perf_val = DIV_ROUND_UP_ULL((u64)freq_val * perf.nominal_perf, nominal_freq);
>>
>> - return clamp_t(u8, perf_val, cpudata->lowest_perf, cpudata->highest_perf);
>> + return clamp_t(u8, perf_val, perf.lowest_perf, perf.highest_perf);
>> }
>>
>> -static inline u32 perf_to_freq(struct amd_cpudata *cpudata, u8 perf_val)
>> +static inline u32 perf_to_freq(union perf_cached perf, u32 nominal_freq, u8 perf_val)
>> {
>> - return DIV_ROUND_UP_ULL((u64)cpudata->nominal_freq * perf_val,
>> - cpudata->nominal_perf);
>> + return DIV_ROUND_UP_ULL((u64)nominal_freq * perf_val,
>> + perf.nominal_perf);
>> }
>>
>> static int __init dmi_matched_7k62_bios_bug(const struct dmi_system_id *dmi)
>> @@ -347,7 +346,9 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
>> }
>>
>> if (trace_amd_pstate_epp_perf_enabled()) {
>> - trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf,
>> + union perf_cached perf = cpudata->perf;
>> +
>> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
>> epp,
>> FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
>> FIELD_GET(AMD_CPPC_MAX_PERF_MASK, cpudata->cppc_req_cached),
>> @@ -425,6 +426,7 @@ static inline int amd_pstate_cppc_enable(bool enable)
>>
>> static int msr_init_perf(struct amd_cpudata *cpudata)
>> {
>> + union perf_cached perf = cpudata->perf;
>> u64 cap1, numerator;
>>
>> int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
>> @@ -436,19 +438,21 @@ static int msr_init_perf(struct amd_cpudata *cpudata)
>> if (ret)
>> return ret;
>>
>> - WRITE_ONCE(cpudata->highest_perf, numerator);
>> - WRITE_ONCE(cpudata->max_limit_perf, numerator);
>> - WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
>> - WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
>> - WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));
>> + perf.highest_perf = numerator;
>> + perf.max_limit_perf = numerator;
>> + perf.min_limit_perf = AMD_CPPC_LOWEST_PERF(cap1);
>> + perf.nominal_perf = AMD_CPPC_NOMINAL_PERF(cap1);
>> + perf.lowest_nonlinear_perf = AMD_CPPC_LOWNONLIN_PERF(cap1);
>> + perf.lowest_perf = AMD_CPPC_LOWEST_PERF(cap1);
>> + WRITE_ONCE(cpudata->perf, perf);
>> WRITE_ONCE(cpudata->prefcore_ranking, AMD_CPPC_HIGHEST_PERF(cap1));
>> - WRITE_ONCE(cpudata->min_limit_perf, AMD_CPPC_LOWEST_PERF(cap1));
>> return 0;
>> }
>>
>> static int shmem_init_perf(struct amd_cpudata *cpudata)
>> {
>> struct cppc_perf_caps cppc_perf;
>> + union perf_cached perf = cpudata->perf;
>> u64 numerator;
>>
>> int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
>> @@ -459,14 +463,14 @@ static int shmem_init_perf(struct amd_cpudata *cpudata)
>> if (ret)
>> return ret;
>>
>> - WRITE_ONCE(cpudata->highest_perf, numerator);
>> - WRITE_ONCE(cpudata->max_limit_perf, numerator);
>> - WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
>> - WRITE_ONCE(cpudata->lowest_nonlinear_perf,
>> - cppc_perf.lowest_nonlinear_perf);
>> - WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
>> + perf.highest_perf = numerator;
>> + perf.max_limit_perf = numerator;
>> + perf.min_limit_perf = cppc_perf.lowest_perf;
>> + perf.nominal_perf = cppc_perf.nominal_perf;
>> + perf.lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
>> + perf.lowest_perf = cppc_perf.lowest_perf;
>> + WRITE_ONCE(cpudata->perf, perf);
>> WRITE_ONCE(cpudata->prefcore_ranking, cppc_perf.highest_perf);
>> - WRITE_ONCE(cpudata->min_limit_perf, cppc_perf.lowest_perf);
>>
>> if (cppc_state == AMD_PSTATE_ACTIVE)
>> return 0;
>> @@ -549,14 +553,14 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u8 min_perf,
>> u8 des_perf, u8 max_perf, bool fast_switch, int gov_flags)
>> {
>> struct cpufreq_policy *policy __free(put_cpufreq_policy) = cpufreq_cpu_get(cpudata->cpu);
>> - u8 nominal_perf = READ_ONCE(cpudata->nominal_perf);
>> + union perf_cached perf = READ_ONCE(cpudata->perf);
>>
>> if (!policy)
>> return;
>>
>> des_perf = clamp_t(u8, des_perf, min_perf, max_perf);
>>
>> - policy->cur = perf_to_freq(cpudata, des_perf);
>> + policy->cur = perf_to_freq(perf, cpudata->nominal_freq, des_perf);
>>
>> if ((cppc_state == AMD_PSTATE_GUIDED) && (gov_flags & CPUFREQ_GOV_DYNAMIC_SWITCHING)) {
>> min_perf = des_perf;
>> @@ -565,7 +569,7 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u8 min_perf,
>>
>> /* limit the max perf when core performance boost feature is disabled */
>> if (!cpudata->boost_supported)
>> - max_perf = min_t(u8, nominal_perf, max_perf);
>> + max_perf = min_t(u8, perf.nominal_perf, max_perf);
>>
>> if (trace_amd_pstate_perf_enabled() && amd_pstate_sample(cpudata)) {
>> trace_amd_pstate_perf(min_perf, des_perf, max_perf, cpudata->freq,
>> @@ -602,36 +606,41 @@ static int amd_pstate_verify(struct cpufreq_policy_data *policy_data)
>> return 0;
>> }
>>
>> -static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
>> +static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
>> {
>> - u8 max_limit_perf, min_limit_perf;
>> struct amd_cpudata *cpudata = policy->driver_data;
>> + union perf_cached perf = READ_ONCE(cpudata->perf);
>>
>> - max_limit_perf = freq_to_perf(cpudata, policy->max);
>> - min_limit_perf = freq_to_perf(cpudata, policy->min);
>> + if (policy->min == perf_to_freq(perf, cpudata->nominal_freq, perf.min_limit_perf) &&
>> + policy->max == perf_to_freq(perf, cpudata->nominal_freq, perf.max_limit_perf))
>> + return;
>
> I guess we can remove this check once we reinstate the min/max_limit_freq caching in cpudata as
> discussed in patch #2, right?
>
Yeah
>>
>> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>> - min_limit_perf = min(cpudata->nominal_perf, max_limit_perf);
>> + perf.max_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->max);
>> + perf.min_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->min);
>>
>> - WRITE_ONCE(cpudata->max_limit_perf, max_limit_perf);
>> - WRITE_ONCE(cpudata->min_limit_perf, min_limit_perf);
>> + if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>> + perf.min_limit_perf = min(perf.nominal_perf, perf.max_limit_perf);
>>
>> - return 0;
>> + WRITE_ONCE(cpudata->perf, perf);
>> }
>>
>> static int amd_pstate_update_freq(struct cpufreq_policy *policy,
>> unsigned int target_freq, bool fast_switch)
>> {
>> struct cpufreq_freqs freqs;
>> - struct amd_cpudata *cpudata = policy->driver_data;
>> + struct amd_cpudata *cpudata;
>> + union perf_cached perf;
>> u8 des_perf;
>>
>> amd_pstate_update_min_max_limit(policy);
>>
>> + cpudata = policy->driver_data;
>
> Any specific reason why we moved this dereferencing after amd_pstate_update_min_max_limit()?
Closer to the first use.
>
>> + perf = READ_ONCE(cpudata->perf);
>> +
>> freqs.old = policy->cur;
>> freqs.new = target_freq;
>>
>> - des_perf = freq_to_perf(cpudata, target_freq);
>> + des_perf = freq_to_perf(perf, cpudata->nominal_freq, target_freq);
>
> Personally I preferred the earlier two-argument format for the helper functions,
> as the helper function handled the common dereferencing part (i.e. cpudata->perf
> and cpudata->nominal_freq).
Something like this?

static inline u8 freq_to_perf(struct amd_cpudata *cpudata, unsigned int freq_val)
{
	union perf_cached perf = READ_ONCE(cpudata->perf);
	u8 perf_val = DIV_ROUND_UP_ULL((u64)freq_val * perf.nominal_perf,
				       cpudata->nominal_freq);

	return clamp_t(u8, perf_val, perf.lowest_perf, perf.highest_perf);
}
As an example of what that turns into in practice once the helper is
inlined, it would be:

static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
{
	struct amd_cpudata *cpudata = policy->driver_data;
	union perf_cached perf = READ_ONCE(cpudata->perf);
	union perf_cached perf2 = READ_ONCE(cpudata->perf);
	union perf_cached perf3 = READ_ONCE(cpudata->perf);
	u8 val1 = DIV_ROUND_UP_ULL((u64)policy->max * perf2.nominal_perf,
				   cpudata->nominal_freq);
	u8 val2 = DIV_ROUND_UP_ULL((u64)policy->min * perf2.nominal_perf,
				   cpudata->nominal_freq);

	perf.max_limit_perf = clamp_t(u8, val1, perf2.lowest_perf,
				      perf2.highest_perf);
	perf.min_limit_perf = clamp_t(u8, val2, perf3.lowest_perf,
				      perf3.highest_perf);
	.
	.
	.

So now that's 3 reads for cpudata->perf in every use.
>
>>
>> WARN_ON(fast_switch && !policy->fast_switch_enabled);
>> /*
>> @@ -642,8 +651,8 @@ static int amd_pstate_update_freq(struct cpufreq_policy *policy,
>> if (!fast_switch)
>> cpufreq_freq_transition_begin(policy, &freqs);
>>
>> - amd_pstate_update(cpudata, cpudata->min_limit_perf, des_perf,
>> - cpudata->max_limit_perf, fast_switch,
>> + amd_pstate_update(cpudata, perf.min_limit_perf, des_perf,
>> + perf.max_limit_perf, fast_switch,
>> policy->governor->flags);
>>
>> if (!fast_switch)
>> @@ -672,19 +681,19 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
>> unsigned long target_perf,
>> unsigned long capacity)
>> {
>> - u8 max_perf, min_perf, des_perf, cap_perf, min_limit_perf;
>> + u8 max_perf, min_perf, des_perf, cap_perf;
>> struct cpufreq_policy *policy __free(put_cpufreq_policy) = cpufreq_cpu_get(cpu);
>> struct amd_cpudata *cpudata;
>> + union perf_cached perf;
>>
>> if (!policy)
>> return;
>>
>> - cpudata = policy->driver_data;
>> -
>> amd_pstate_update_min_max_limit(policy);
>>
>> - cap_perf = READ_ONCE(cpudata->highest_perf);
>> - min_limit_perf = READ_ONCE(cpudata->min_limit_perf);
>> + cpudata = policy->driver_data;
>
> Similar question as above
>
>> + perf = READ_ONCE(cpudata->perf);
>> + cap_perf = perf.highest_perf;
>>
>> des_perf = cap_perf;
>> if (target_perf < capacity)
>> @@ -695,10 +704,10 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
>> else
>> min_perf = cap_perf;
>>
>> - if (min_perf < min_limit_perf)
>> - min_perf = min_limit_perf;
>> + if (min_perf < perf.min_limit_perf)
>> + min_perf = perf.min_limit_perf;
>>
>> - max_perf = cpudata->max_limit_perf;
>> + max_perf = perf.max_limit_perf;
>> if (max_perf < min_perf)
>> max_perf = min_perf;
>>
>> @@ -709,11 +718,12 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
>> static int amd_pstate_cpu_boost_update(struct cpufreq_policy *policy, bool on)
>> {
>> struct amd_cpudata *cpudata = policy->driver_data;
>> + union perf_cached perf = READ_ONCE(cpudata->perf);
>> u32 nominal_freq, max_freq;
>> int ret = 0;
>>
>> nominal_freq = READ_ONCE(cpudata->nominal_freq);
>> - max_freq = perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf));
>> + max_freq = perf_to_freq(perf, cpudata->nominal_freq, perf.highest_perf);
>>
>> if (on)
>> policy->cpuinfo.max_freq = max_freq;
>> @@ -884,25 +894,24 @@ static u32 amd_pstate_get_transition_latency(unsigned int cpu)
>> }
>>
>> /*
>> - * amd_pstate_init_freq: Initialize the max_freq, min_freq,
>> - * nominal_freq and lowest_nonlinear_freq for
>> - * the @cpudata object.
>> + * amd_pstate_init_freq: Initialize the nominal_freq and lowest_nonlinear_freq
>> + * for the @cpudata object.
>> *
>> - * Requires: highest_perf, lowest_perf, nominal_perf and
>> - * lowest_nonlinear_perf members of @cpudata to be
>> - * initialized.
>> + * Requires: all perf members of @cpudata to be initialized.
>> *
>> - * Returns 0 on success, non-zero value on failure.
>> + * Returns 0 on success, non-zero value on failure.
>> */
>> static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
>> {
>> - int ret;
>> u32 min_freq, nominal_freq, lowest_nonlinear_freq;
>> struct cppc_perf_caps cppc_perf;
>> + union perf_cached perf;
>> + int ret;
>>
>> ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
>> if (ret)
>> return ret;
>> + perf = READ_ONCE(cpudata->perf);
>>
>> if (quirks && quirks->nominal_freq)
>> nominal_freq = quirks->nominal_freq;
>> @@ -914,6 +923,7 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
>>
>> if (quirks && quirks->lowest_freq) {
>> min_freq = quirks->lowest_freq;
>> + perf.lowest_perf = freq_to_perf(perf, nominal_freq, min_freq);
>> } else
>> min_freq = cppc_perf.lowest_freq;
>>
>> @@ -929,7 +939,7 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
>> return -EINVAL;
>> }
>>
>> - lowest_nonlinear_freq = perf_to_freq(cpudata, cpudata->lowest_nonlinear_perf);
>> + lowest_nonlinear_freq = perf_to_freq(perf, nominal_freq, perf.lowest_nonlinear_perf);
>> WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
>>
>> if (lowest_nonlinear_freq <= min_freq || lowest_nonlinear_freq > nominal_freq) {
>> @@ -944,6 +954,7 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
>> static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>> {
>> struct amd_cpudata *cpudata;
>> + union perf_cached perf;
>> struct device *dev;
>> int ret;
>>
>> @@ -979,8 +990,14 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>> policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
>> policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
>>
>> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
>> - policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
>> + perf = READ_ONCE(cpudata->perf);
>> +
>> + policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>> + cpudata->nominal_freq,
>> + perf.lowest_perf);
>> + policy->cpuinfo.max_freq = policy->max = perf_to_freq(perf,
>> + cpudata->nominal_freq,
>> + perf.highest_perf);
>>
>> policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
>>
>> @@ -1061,23 +1078,33 @@ static int amd_pstate_cpu_suspend(struct cpufreq_policy *policy)
>> static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
>> char *buf)
>> {
>> - struct amd_cpudata *cpudata = policy->driver_data;
>> + struct amd_cpudata *cpudata;
>> + union perf_cached perf;
>> +
>> + if (!policy)
>> + return -EINVAL;
>
> Do we need to check the policy if it is being passed by a sysfs file access?
Good point. Will drop that.
>
> I don't see a similar check in the show_one-based sysfs functions in cpufreq.c;
> they just dereference it directly.
>
> #define show_one(file_name, object) \
> static ssize_t show_##file_name \
> (struct cpufreq_policy *policy, char *buf) \
> { \
> return sysfs_emit(buf, "%u\n", policy->object); \
> }
>
> show_one(cpuinfo_min_freq, cpuinfo.min_freq);
> show_one(cpuinfo_max_freq, cpuinfo.max_freq);
> show_one(cpuinfo_transition_latency, cpuinfo.transition_latency);
> show_one(scaling_min_freq, min);
> show_one(scaling_max_freq, max)
>
>>
>> + cpudata = policy->driver_data;
>> + perf = READ_ONCE(cpudata->perf);
>>
>> - return sysfs_emit(buf, "%u\n", perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf)));
>> + return sysfs_emit(buf, "%u\n",
>> + perf_to_freq(perf, cpudata->nominal_freq, perf.max_limit_perf));
>
> For example, this function was a lot cleaner before, as perf_to_freq() handled
> the common dereferencing part.
Yeah I guess it was a lot cleaner before, just more reads too.
>
>> }
>>
>> static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *policy,
>> char *buf)
>> {
>> - int freq;
>> - struct amd_cpudata *cpudata = policy->driver_data;
>> + struct amd_cpudata *cpudata;
>> + union perf_cached perf;
>> +
>> + if (!policy)
>> + return -EINVAL;
>
> Same reason as above: is this check needed?
>
>>
>> - freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
>> - if (freq < 0)
>> - return freq;
>> + cpudata = policy->driver_data;
>> + perf = READ_ONCE(cpudata->perf);
>>
>> - return sysfs_emit(buf, "%u\n", freq);
>> + return sysfs_emit(buf, "%u\n",
>> + perf_to_freq(perf, cpudata->nominal_freq, perf.lowest_nonlinear_perf));
>
> Same comment about doing the dereferencing in helper function.
>
>> }
>>
>> /*
>> @@ -1087,12 +1114,14 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
>> static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
>> char *buf)
>> {
>> - u8 perf;
>> - struct amd_cpudata *cpudata = policy->driver_data;
>> + struct amd_cpudata *cpudata;
>>
>> - perf = READ_ONCE(cpudata->highest_perf);
>> + if (!policy)
>> + return -EINVAL;
>
> Same comment: can we remove it if unnecessary?
>
>>
>> - return sysfs_emit(buf, "%u\n", perf);
>> + cpudata = policy->driver_data;
>> +
>> + return sysfs_emit(buf, "%u\n", cpudata->perf.highest_perf);
>> }
>>
>> static ssize_t show_amd_pstate_prefcore_ranking(struct cpufreq_policy *policy,
>> @@ -1423,6 +1452,7 @@ static bool amd_pstate_acpi_pm_profile_undefined(void)
>> static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>> {
>> struct amd_cpudata *cpudata;
>> + union perf_cached perf;
>> struct device *dev;
>> u64 value;
>> int ret;
>> @@ -1456,8 +1486,15 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>> if (ret)
>> goto free_cpudata1;
>>
>> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
>> - policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
>> + perf = READ_ONCE(cpudata->perf);
>> +
>> + policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>> + cpudata->nominal_freq,
>> + perf.lowest_perf);
>> + policy->cpuinfo.max_freq = policy->max = perf_to_freq(perf,
>> + cpudata->nominal_freq,
>> + perf.highest_perf);
>> +
>> /* It will be updated by governor */
>> policy->cur = policy->cpuinfo.min_freq;
>>
>> @@ -1518,6 +1555,7 @@ static void amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
>> static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
>> {
>> struct amd_cpudata *cpudata = policy->driver_data;
>> + union perf_cached perf;
>> u8 epp;
>>
>> amd_pstate_update_min_max_limit(policy);
>> @@ -1527,15 +1565,16 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
>> else
>> epp = READ_ONCE(cpudata->epp_cached);
>>
>> + perf = READ_ONCE(cpudata->perf);
>> if (trace_amd_pstate_epp_perf_enabled()) {
>> - trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf, epp,
>> - cpudata->min_limit_perf,
>> - cpudata->max_limit_perf,
>> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf, epp,
>> + perf.min_limit_perf,
>> + perf.max_limit_perf,
>> policy->boost_enabled);
>> }
>>
>> - return amd_pstate_update_perf(cpudata, cpudata->min_limit_perf, 0U,
>> - cpudata->max_limit_perf, epp, false);
>> + return amd_pstate_update_perf(cpudata, perf.min_limit_perf, 0U,
>> + perf.max_limit_perf, epp, false);
>> }
>>
>> static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
>> @@ -1567,23 +1606,21 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
>> static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
>> {
>> struct amd_cpudata *cpudata = policy->driver_data;
>> - u8 max_perf;
>> + union perf_cached perf = cpudata->perf;
>
> Do we have a rule for when READ_ONCE is needed and when it isn't?
> I'm a bit fuzzy on how to decide this one. Any rule of thumb?
I've been wondering the same thing. Gautham, can you enlighten?
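The rule of thumb I've seen cited (offered as a sketch, not an
authoritative answer to the question above): pair WRITE_ONCE() on the
writer side with READ_ONCE() on any reader that can run without holding
the writer's lock, so the compiler can't tear, fuse, or re-load the
access; a plain read is only safe when the same lock is held. E.g.:

	/* writer, under cpudata->lock */
	WRITE_ONCE(cpudata->perf, perf);

	/* lock-free reader elsewhere */
	union perf_cached perf = READ_ONCE(cpudata->perf);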
>
>> int ret;
>>
>> ret = amd_pstate_cppc_enable(true);
>> if (ret)
>> pr_err("failed to enable amd pstate during resume, return %d\n", ret);
>>
>> - max_perf = READ_ONCE(cpudata->highest_perf);
>> -
>> if (trace_amd_pstate_epp_perf_enabled()) {
>> - trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf,
>> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
>> cpudata->epp_cached,
>> FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
>> - max_perf, policy->boost_enabled);
>> + perf.highest_perf, policy->boost_enabled);
>> }
>>
>> - return amd_pstate_update_perf(cpudata, 0, 0, max_perf, cpudata->epp_cached, false);
>> + return amd_pstate_update_perf(cpudata, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
>> }
>>
>> static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
>> @@ -1604,22 +1641,21 @@ static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
>> static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
>> {
>> struct amd_cpudata *cpudata = policy->driver_data;
>> - u8 min_perf;
>> + union perf_cached perf = cpudata->perf;
>>
>> if (cpudata->suspended)
>> return 0;
>>
>> - min_perf = READ_ONCE(cpudata->lowest_perf);
>> -
>> guard(mutex)(&amd_pstate_limits_lock);
>>
>> if (trace_amd_pstate_epp_perf_enabled()) {
>> - trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf,
>> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
>> AMD_CPPC_EPP_BALANCE_POWERSAVE,
>> - min_perf, min_perf, policy->boost_enabled);
>> + perf.lowest_perf, perf.lowest_perf,
>> + policy->boost_enabled);
>> }
>>
>> - return amd_pstate_update_perf(cpudata, min_perf, 0, min_perf,
>> + return amd_pstate_update_perf(cpudata, perf.lowest_perf, 0, perf.lowest_perf,
>> AMD_CPPC_EPP_BALANCE_POWERSAVE, false);
>> }
>>
>> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
>> index 472044a1de43b..a140704b97430 100644
>> --- a/drivers/cpufreq/amd-pstate.h
>> +++ b/drivers/cpufreq/amd-pstate.h
>> @@ -13,6 +13,34 @@
>> /*********************************************************************
>> * AMD P-state INTERFACE *
>> *********************************************************************/
>> +
>> +/**
>> + * union perf_cached - A union to cache performance-related data.
>> + * @highest_perf: the maximum performance an individual processor may reach,
>> + * assuming ideal conditions
>> + * For platforms that do not support the preferred core feature, the
>> + * highest_pef may be configured with 166 or 255, to avoid max frequency
>> + * calculated wrongly. we take the fixed value as the highest_perf.
>> + * @nominal_perf: the maximum sustained performance level of the processor,
>> + * assuming ideal operating conditions
>> + * @lowest_nonlinear_perf: the lowest performance level at which nonlinear power
>> + * savings are achieved
>> + * @lowest_perf: the absolute lowest performance level of the processor
>> + * @min_limit_perf: Cached value of the performance corresponding to policy->min
>> + * @max_limit_perf: Cached value of the performance corresponding to policy->max
>> + */
>> +union perf_cached {
>> + struct {
>> + u8 highest_perf;
>> + u8 nominal_perf;
>> + u8 lowest_nonlinear_perf;
>> + u8 lowest_perf;
>> + u8 min_limit_perf;
>> + u8 max_limit_perf;
>> + };
>> + u64 val;
>> +};
>> +
>> /**
>> * struct amd_aperf_mperf
>> * @aperf: actual performance frequency clock count
>> @@ -30,20 +58,8 @@ struct amd_aperf_mperf {
>> * @cpu: CPU number
>> * @req: constraint request to apply
>> * @cppc_req_cached: cached performance request hints
>> - * @highest_perf: the maximum performance an individual processor may reach,
>> - * assuming ideal conditions
>> - * For platforms that do not support the preferred core feature, the
>> - * highest_pef may be configured with 166 or 255, to avoid max frequency
>> - * calculated wrongly. we take the fixed value as the highest_perf.
>> - * @nominal_perf: the maximum sustained performance level of the processor,
>> - * assuming ideal operating conditions
>> - * @lowest_nonlinear_perf: the lowest performance level at which nonlinear power
>> - * savings are achieved
>> - * @lowest_perf: the absolute lowest performance level of the processor
>> * @prefcore_ranking: the preferred core ranking, the higher value indicates a higher
>> * priority.
>> - * @min_limit_perf: Cached value of the performance corresponding to policy->min
>> - * @max_limit_perf: Cached value of the performance corresponding to policy->max
>> * @nominal_freq: the frequency (in khz) that mapped to nominal_perf
>> * @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
>> * @cur: Difference of Aperf/Mperf/tsc count between last and current sample
>> @@ -66,13 +82,9 @@ struct amd_cpudata {
>> struct freq_qos_request req[2];
>> u64 cppc_req_cached;
>>
>> - u8 highest_perf;
>> - u8 nominal_perf;
>> - u8 lowest_nonlinear_perf;
>> - u8 lowest_perf;
>> + union perf_cached perf;
>
> Can we please add the description for this in the comment above
Good catch, yeah.
>
>> +
>> u8 prefcore_ranking;
>> - u8 min_limit_perf;
>> - u8 max_limit_perf;
>>
>> u32 nominal_freq;
>> u32 lowest_nonlinear_freq;
>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 12/14] cpufreq/amd-pstate: Cache a pointer to policy in cpudata
2025-02-11 19:17 ` Mario Limonciello
@ 2025-02-12 3:52 ` Dhananjay Ugwekar
0 siblings, 0 replies; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-12 3:52 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/12/2025 12:47 AM, Mario Limonciello wrote:
> On 2/11/2025 07:13, Dhananjay Ugwekar wrote:
>> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>>> From: Mario Limonciello <mario.limonciello@amd.com>
>>>
>>> In order to access the policy from a notification block it will
>>> need to be stored in cpudata.
>>
>> This might break the cpufreq_policy ref counting right?, if we cache the pointer
>> and use it independent of the ref counting framework.
>
> Would it be reasonable to bump the ref count when we take the pointer?
>
> I'm not sure if this will work properly.
One doubt, why cant we get the policy ref normally using the cpufreq_cpu_get(cpudata->cpu)
in the notification block ? I'm not aware of that code, if there is any restriction on this.
>
>>
>>>
>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>> ---
>>> drivers/cpufreq/amd-pstate.c | 13 +++++++------
>>> drivers/cpufreq/amd-pstate.h | 3 ++-
>>> 2 files changed, 9 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>>> index 689de385d06da..5945b6c7f7e56 100644
>>> --- a/drivers/cpufreq/amd-pstate.c
>>> +++ b/drivers/cpufreq/amd-pstate.c
>>> @@ -388,7 +388,7 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
>>> else
>>> epp = epp_values[pref_index];
>>> - if (epp > 0 && cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
>>> + if (epp > 0 && policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
>>> pr_debug("EPP cannot be set under performance policy\n");
>>> return -EBUSY;
>>> }
>>> @@ -689,7 +689,7 @@ static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
>>> perf.max_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->max);
>>> perf.min_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->min);
>>> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>>> + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
>>> perf.min_limit_perf = min(perf.nominal_perf, perf.max_limit_perf);
>>> WRITE_ONCE(cpudata->perf, perf);
>>> @@ -1042,6 +1042,7 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>> return -ENOMEM;
>>> cpudata->cpu = policy->cpu;
>>> + cpudata->policy = policy;
>>> mutex_init(&cpudata->lock);
>>> guard(mutex)(&cpudata->lock);
>>> @@ -1224,9 +1225,8 @@ static ssize_t show_energy_performance_available_preferences(
>>> {
>>> int i = 0;
>>> int offset = 0;
>>> - struct amd_cpudata *cpudata = policy->driver_data;
>>> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>>> + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
>>> return sysfs_emit_at(buf, offset, "%s\n",
>>> energy_perf_strings[EPP_INDEX_PERFORMANCE]);
>>> @@ -1543,6 +1543,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>>> return -ENOMEM;
>>> cpudata->cpu = policy->cpu;
>>> + cpudata->policy = policy;
>>> mutex_init(&cpudata->lock);
>>> guard(mutex)(&cpudata->lock);
>>> @@ -1632,7 +1633,7 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
>>> amd_pstate_update_min_max_limit(policy);
>>> - if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>>> + if (policy->policy == CPUFREQ_POLICY_PERFORMANCE)
>>> epp = 0;
>>> else
>>> epp = READ_ONCE(cpudata->epp_cached);
>>> @@ -1651,7 +1652,7 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
>>> if (!policy->cpuinfo.max_freq)
>>> return -ENODEV;
>>> - cpudata->policy = policy->policy;
>>> + cpudata->policy = policy;
>>> ret = amd_pstate_epp_update_limit(policy);
>>> if (ret)
>>> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
>>> index 7501d30db9953..16ce631a6c3d5 100644
>>> --- a/drivers/cpufreq/amd-pstate.h
>>> +++ b/drivers/cpufreq/amd-pstate.h
>>> @@ -97,9 +97,10 @@ struct amd_cpudata {
>>> struct mutex lock;
>>> + struct cpufreq_policy *policy;
>>> +
>>> /* EPP feature related attributes*/
>>> u8 epp_cached;
>>> - u32 policy;
>>> bool suspended;
>>> u8 epp_default;
>>> };
>>
>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 04/14] cpufreq/amd-pstate: Overhaul locking
2025-02-11 21:54 ` Mario Limonciello
@ 2025-02-12 5:15 ` Dhananjay Ugwekar
2025-02-12 22:05 ` Mario Limonciello
0 siblings, 1 reply; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-12 5:15 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/12/2025 3:24 AM, Mario Limonciello wrote:
> On 2/10/2025 23:02, Dhananjay Ugwekar wrote:
>> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>>> From: Mario Limonciello <mario.limonciello@amd.com>
>>>
>>> amd_pstate_cpu_boost_update() and refresh_frequency_limits() both
>>> update the policy state and have nothing to do with the amd-pstate
>>> driver itself.
>>>
>>> A global "limits" lock doesn't make sense because each CPU can have
>>> policies changed independently. Instead introduce locks into to the
>>> cpudata structure and lock each CPU independently.
>>>
>>> The remaining "global" driver lock is used to ensure that only one
>>> entity can change driver modes at a given time.
>>>
>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>> ---
>>> drivers/cpufreq/amd-pstate.c | 27 +++++++++++++++++----------
>>> drivers/cpufreq/amd-pstate.h | 2 ++
>>> 2 files changed, 19 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>>> index 77bc6418731ee..dd230ed3b9579 100644
>>> --- a/drivers/cpufreq/amd-pstate.c
>>> +++ b/drivers/cpufreq/amd-pstate.c
>>> @@ -196,7 +196,6 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
>>> return -EINVAL;
>>> }
>>> -static DEFINE_MUTEX(amd_pstate_limits_lock);
>>> static DEFINE_MUTEX(amd_pstate_driver_lock);
>>> static u8 msr_get_epp(struct amd_cpudata *cpudata)
>>> @@ -283,6 +282,8 @@ static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
>>> u64 value, prev;
>>> int ret;
>>> + lockdep_assert_held(&cpudata->lock);
>>
>> After making the perf_cached variable writes atomic, do we still need a cpudata->lock ?
>
> My concern was specifically that userspace could interact with multiple sysfs files that influence the atomic perf variable (and the HW) at the same time. So you would not have a deterministic behavior if they raced. But if you take the mutex on all the paths that this could happen it will be a FIFO.
I guess the lock still won't guarantee the ordering, right? It will just ensure that only one thread executes
that code path for a specific CPU at a time. And do we even care about the ordering? I'm having a hard
time thinking of a scenario where we'll need the lock. Can you or Gautham think of any such scenario?
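For reference, the window being discussed is the unlocked read-modify-write on
cppc_req_cached (simplified from msr_set_epp() above); two racing writers can
both observe the same prev value, and the later store silently wins:

	value = prev = READ_ONCE(cpudata->cppc_req_cached);
	value &= ~AMD_CPPC_EPP_PERF_MASK;
	value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);

	/* a second writer may have updated the cache in between */
	WRITE_ONCE(cpudata->cppc_req_cached, value);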
>
>>
>> Regards,
>> Dhananjay
>>
>>> +
>>> value = prev = READ_ONCE(cpudata->cppc_req_cached);
>>> value &= ~AMD_CPPC_EPP_PERF_MASK;
>>> value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
>>> @@ -315,6 +316,8 @@ static int shmem_set_epp(struct amd_cpudata *cpudata, u8 epp)
>>> int ret;
>>> struct cppc_perf_ctrls perf_ctrls;
>>> + lockdep_assert_held(&cpudata->lock);
>>> +
>>> if (epp == cpudata->epp_cached)
>>> return 0;
>>> @@ -335,6 +338,8 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
>>> struct amd_cpudata *cpudata = policy->driver_data;
>>> u8 epp;
>>> + guard(mutex)(&cpudata->lock);
>>> +
>>> if (!pref_index)
>>> epp = cpudata->epp_default;
>>> else
>>> @@ -750,7 +755,6 @@ static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
>>> pr_err("Boost mode is not supported by this processor or SBIOS\n");
>>> return -EOPNOTSUPP;
>>> }
>>> - guard(mutex)(&amd_pstate_driver_lock);
>>> ret = amd_pstate_cpu_boost_update(policy, state);
>>> refresh_frequency_limits(policy);
>>> @@ -973,6 +977,9 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>> cpudata->cpu = policy->cpu;
>>> + mutex_init(&cpudata->lock);
>>> + guard(mutex)(&cpudata->lock);
>>> +
>>> ret = amd_pstate_init_perf(cpudata);
>>> if (ret)
>>> goto free_cpudata1;
>>> @@ -1179,8 +1186,6 @@ static ssize_t store_energy_performance_preference(
>>> if (ret < 0)
>>> return -EINVAL;
>>> - guard(mutex)(&amd_pstate_limits_lock);
>>> -
>>> ret = amd_pstate_set_energy_pref_index(policy, ret);
>>> return ret ? ret : count;
>>> @@ -1353,8 +1358,10 @@ int amd_pstate_update_status(const char *buf, size_t size)
>>> if (mode_idx < 0 || mode_idx >= AMD_PSTATE_MAX)
>>> return -EINVAL;
>>> - if (mode_state_machine[cppc_state][mode_idx])
>>> + if (mode_state_machine[cppc_state][mode_idx]) {
>>> + guard(mutex)(&amd_pstate_driver_lock);
>>> return mode_state_machine[cppc_state][mode_idx](mode_idx);
>>> + }
>>> return 0;
>>> }
>>> @@ -1375,7 +1382,6 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
>>> char *p = memchr(buf, '\n', count);
>>> int ret;
>>> - guard(mutex)(&amd_pstate_driver_lock);
>>> ret = amd_pstate_update_status(buf, p ? p - buf : count);
>>> return ret < 0 ? ret : count;
>>> @@ -1472,6 +1478,9 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>>> cpudata->cpu = policy->cpu;
>>> + mutex_init(&cpudata->lock);
>>> + guard(mutex)(&cpudata->lock);
>>> +
>>> ret = amd_pstate_init_perf(cpudata);
>>> if (ret)
>>> goto free_cpudata1;
>>> @@ -1558,6 +1567,8 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
>>> union perf_cached perf;
>>> u8 epp;
>>> + guard(mutex)(&cpudata->lock);
>>> +
>>> amd_pstate_update_min_max_limit(policy);
>>> if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>>> @@ -1646,8 +1657,6 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
>>> if (cpudata->suspended)
>>> return 0;
>>> - guard(mutex)(&amd_pstate_limits_lock);
>>> -
>>> if (trace_amd_pstate_epp_perf_enabled()) {
>>> trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
>>> AMD_CPPC_EPP_BALANCE_POWERSAVE,
>>> @@ -1684,8 +1693,6 @@ static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
>>> struct amd_cpudata *cpudata = policy->driver_data;
>>> if (cpudata->suspended) {
>>> - guard(mutex)(&amd_pstate_limits_lock);
>>> -
>>> /* enable amd pstate from suspend state*/
>>> amd_pstate_epp_reenable(policy);
>>> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
>>> index a140704b97430..6d776c3e5712a 100644
>>> --- a/drivers/cpufreq/amd-pstate.h
>>> +++ b/drivers/cpufreq/amd-pstate.h
>>> @@ -96,6 +96,8 @@ struct amd_cpudata {
>>> bool boost_supported;
>>> bool hw_prefcore;
>>> + struct mutex lock;
>>> +
>>> /* EPP feature related attributes*/
>>> u8 epp_cached;
>>> u32 policy;
>>
>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 03/14] cpufreq/amd-pstate: Move perf values into a union
2025-02-11 22:14 ` Mario Limonciello
@ 2025-02-12 6:31 ` Dhananjay Ugwekar
2025-02-12 22:03 ` Mario Limonciello
0 siblings, 1 reply; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-12 6:31 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/12/2025 3:44 AM, Mario Limonciello wrote:
> On 2/10/2025 07:38, Dhananjay Ugwekar wrote:
>> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>>> From: Mario Limonciello <mario.limonciello@amd.com>
>>>
>>> By storing perf values in a union all the writes and reads can
>>> be done atomically, removing the need for some concurrency protections.
>>>
>>> While making this change, also drop the cached frequency values,
>>> using inline helpers to calculate them on demand from perf value.
>>>
>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>> ---
[Snip]
>>> static int amd_pstate_update_freq(struct cpufreq_policy *policy,
>>> unsigned int target_freq, bool fast_switch)
>>> {
>>> struct cpufreq_freqs freqs;
>>> - struct amd_cpudata *cpudata = policy->driver_data;
>>> + struct amd_cpudata *cpudata;
>>> + union perf_cached perf;
>>> u8 des_perf;
>>> amd_pstate_update_min_max_limit(policy);
>>> + cpudata = policy->driver_data;
>>
>> Any specific reason why we moved this dereferencing after amd_pstate_update_min_max_limit()?
>
> Closer to the first use.
>
>>
>>> + perf = READ_ONCE(cpudata->perf);
>>> +
>>> freqs.old = policy->cur;
>>> freqs.new = target_freq;
>>> - des_perf = freq_to_perf(cpudata, target_freq);
>>> + des_perf = freq_to_perf(perf, cpudata->nominal_freq, target_freq);
>>
>> Personally I preferred the earlier 2-argument format for the helper functions, as the helper
>> function handled the common dereferencing part (i.e. cpudata->perf and cpudata->nominal_freq).
>
> Something like this?
>
> static inline u8 freq_to_perf(struct amd_cpudata *cpudata, unsigned int freq_val)
> {
> union perf_cached perf = READ_ONCE(cpudata->perf);
> u8 perf_val = DIV_ROUND_UP_ULL((u64)freq_val * perf.nominal_perf, cpudata->nominal_freq);
>
> return clamp_t(u8, perf_val, perf.lowest_perf, perf.highest_perf);
> }
>
> As an example, in practice what that turns into with inline code is:
>
> static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> union perf_cached perf = READ_ONCE(cpudata->perf);
> union perf_cached perf2 = READ_ONCE(cpudata->perf);
> union perf_cached perf3 = READ_ONCE(cpudata->perf);
> u8 val1 = DIV_ROUND_UP_ULL((u64)policy->max * perf2.nominal_perf, cpudata->nominal_freq);
> u8 val2 = DIV_ROUND_UP_ULL((u64)policy->min * perf2.nominal_perf, cpudata->nominal_freq);
>
> perf.max_limit_perf = clamp_t(u8, val1, perf2.lowest_perf, perf2.highest_perf);
> perf.min_limit_perf = clamp_t(u8, val2, perf3.lowest_perf, perf3.highest_perf);
> .
> .
> .
>
> So now that's 3 reads for cpudata->perf in every use.
Yeah, right, it's a tradeoff between cleaner-looking code and fewer computations.
I'll leave it up to you, I'm okay either way.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 09/14] cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functions
2025-02-06 21:56 ` [PATCH 09/14] cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functions Mario Limonciello
@ 2025-02-12 6:39 ` Dhananjay Ugwekar
0 siblings, 0 replies; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-12 6:39 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> The EPP tracing is done by the caller today, but this precludes capturing
> information about whether the CPPC request has changed.
>
> Move it into the update_perf and set_epp functions and include information
> about whether the request has changed from the last one.
Looks good to me,
Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>
Thanks,
Dhananjay
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate-trace.h | 13 +++-
> drivers/cpufreq/amd-pstate.c | 119 ++++++++++++++++++-----------
> 2 files changed, 83 insertions(+), 49 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate-trace.h b/drivers/cpufreq/amd-pstate-trace.h
> index f457d4af2c62e..32e1bdc588c52 100644
> --- a/drivers/cpufreq/amd-pstate-trace.h
> +++ b/drivers/cpufreq/amd-pstate-trace.h
> @@ -90,7 +90,8 @@ TRACE_EVENT(amd_pstate_epp_perf,
> u8 epp,
> u8 min_perf,
> u8 max_perf,
> - bool boost
> + bool boost,
> + bool changed
> ),
>
> TP_ARGS(cpu_id,
> @@ -98,7 +99,8 @@ TRACE_EVENT(amd_pstate_epp_perf,
> epp,
> min_perf,
> max_perf,
> - boost),
> + boost,
> + changed),
>
> TP_STRUCT__entry(
> __field(unsigned int, cpu_id)
> @@ -107,6 +109,7 @@ TRACE_EVENT(amd_pstate_epp_perf,
> __field(u8, min_perf)
> __field(u8, max_perf)
> __field(bool, boost)
> + __field(bool, changed)
> ),
>
> TP_fast_assign(
> @@ -116,15 +119,17 @@ TRACE_EVENT(amd_pstate_epp_perf,
> __entry->min_perf = min_perf;
> __entry->max_perf = max_perf;
> __entry->boost = boost;
> + __entry->changed = changed;
> ),
>
> - TP_printk("cpu%u: [%hhu<->%hhu]/%hhu, epp=%hhu, boost=%u",
> + TP_printk("cpu%u: [%hhu<->%hhu]/%hhu, epp=%hhu, boost=%u, changed=%u",
> (unsigned int)__entry->cpu_id,
> (u8)__entry->min_perf,
> (u8)__entry->max_perf,
> (u8)__entry->highest_perf,
> (u8)__entry->epp,
> - (bool)__entry->boost
> + (bool)__entry->boost,
> + (bool)__entry->changed
> )
> );
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 2aa3d5be2efe5..e66ccfce5893f 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -228,9 +228,10 @@ static u8 shmem_get_epp(struct amd_cpudata *cpudata)
> return FIELD_GET(AMD_CPPC_EPP_PERF_MASK, epp);
> }
>
> -static int msr_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
> +static int msr_update_perf(struct cpufreq_policy *policy, u8 min_perf,
> u8 des_perf, u8 max_perf, u8 epp, bool fast_switch)
> {
> + struct amd_cpudata *cpudata = policy->driver_data;
> u64 value, prev;
>
> value = prev = READ_ONCE(cpudata->cppc_req_cached);
> @@ -242,6 +243,18 @@ static int msr_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
> value |= FIELD_PREP(AMD_CPPC_MIN_PERF_MASK, min_perf);
> value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
>
> + if (trace_amd_pstate_epp_perf_enabled()) {
> + union perf_cached perf = cpudata->perf;
> +
> + trace_amd_pstate_epp_perf(cpudata->cpu,
> + perf.highest_perf,
> + epp,
> + min_perf,
> + max_perf,
> + policy->boost_enabled,
> + value != prev);
> + }
> +
> if (value == prev)
> return 0;
>
> @@ -256,24 +269,26 @@ static int msr_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
> }
>
> WRITE_ONCE(cpudata->cppc_req_cached, value);
> - WRITE_ONCE(cpudata->epp_cached, epp);
> + if (epp != cpudata->epp_cached)
> + WRITE_ONCE(cpudata->epp_cached, epp);
>
> return 0;
> }
>
> DEFINE_STATIC_CALL(amd_pstate_update_perf, msr_update_perf);
>
> -static inline int amd_pstate_update_perf(struct amd_cpudata *cpudata,
> +static inline int amd_pstate_update_perf(struct cpufreq_policy *policy,
> u8 min_perf, u8 des_perf,
> u8 max_perf, u8 epp,
> bool fast_switch)
> {
> - return static_call(amd_pstate_update_perf)(cpudata, min_perf, des_perf,
> + return static_call(amd_pstate_update_perf)(policy, min_perf, des_perf,
> max_perf, epp, fast_switch);
> }
>
> -static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
> +static int msr_set_epp(struct cpufreq_policy *policy, u8 epp)
> {
> + struct amd_cpudata *cpudata = policy->driver_data;
> u64 value, prev;
> int ret;
>
> @@ -283,6 +298,19 @@ static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
> value &= ~AMD_CPPC_EPP_PERF_MASK;
> value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
>
> + if (trace_amd_pstate_epp_perf_enabled()) {
> + union perf_cached perf = cpudata->perf;
> +
> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
> + epp,
> + FIELD_GET(AMD_CPPC_MIN_PERF_MASK,
> + cpudata->cppc_req_cached),
> + FIELD_GET(AMD_CPPC_MAX_PERF_MASK,
> + cpudata->cppc_req_cached),
> + policy->boost_enabled,
> + value != prev);
> + }
> +
> if (value == prev)
> return 0;
>
> @@ -301,18 +329,32 @@ static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
>
> DEFINE_STATIC_CALL(amd_pstate_set_epp, msr_set_epp);
>
> -static inline int amd_pstate_set_epp(struct amd_cpudata *cpudata, u8 epp)
> +static inline int amd_pstate_set_epp(struct cpufreq_policy *policy, u8 epp)
> {
> - return static_call(amd_pstate_set_epp)(cpudata, epp);
> + return static_call(amd_pstate_set_epp)(policy, epp);
> }
>
> -static int shmem_set_epp(struct amd_cpudata *cpudata, u8 epp)
> +static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
> {
> - int ret;
> + struct amd_cpudata *cpudata = policy->driver_data;
> struct cppc_perf_ctrls perf_ctrls;
> + int ret;
>
> lockdep_assert_held(&cpudata->lock);
>
> + if (trace_amd_pstate_epp_perf_enabled()) {
> + union perf_cached perf = cpudata->perf;
> +
> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
> + epp,
> + FIELD_GET(AMD_CPPC_MIN_PERF_MASK,
> + cpudata->cppc_req_cached),
> + FIELD_GET(AMD_CPPC_MAX_PERF_MASK,
> + cpudata->cppc_req_cached),
> + policy->boost_enabled,
> + epp != cpudata->epp_cached);
> + }
> +
> if (epp == cpudata->epp_cached)
> return 0;
>
> @@ -345,17 +387,7 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
> return -EBUSY;
> }
>
> - if (trace_amd_pstate_epp_perf_enabled()) {
> - union perf_cached perf = cpudata->perf;
> -
> - trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
> - epp,
> - FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
> - FIELD_GET(AMD_CPPC_MAX_PERF_MASK, cpudata->cppc_req_cached),
> - policy->boost_enabled);
> - }
> -
> - return amd_pstate_set_epp(cpudata, epp);
> + return amd_pstate_set_epp(policy, epp);
> }
>
> static inline int msr_cppc_enable(bool enable)
> @@ -498,15 +530,16 @@ static inline int amd_pstate_init_perf(struct amd_cpudata *cpudata)
> return static_call(amd_pstate_init_perf)(cpudata);
> }
>
> -static int shmem_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
> +static int shmem_update_perf(struct cpufreq_policy *policy, u8 min_perf,
> u8 des_perf, u8 max_perf, u8 epp, bool fast_switch)
> {
> + struct amd_cpudata *cpudata = policy->driver_data;
> struct cppc_perf_ctrls perf_ctrls;
> u64 value, prev;
> int ret;
>
> if (cppc_state == AMD_PSTATE_ACTIVE) {
> - int ret = shmem_set_epp(cpudata, epp);
> + int ret = shmem_set_epp(policy, epp);
>
> if (ret)
> return ret;
> @@ -521,6 +554,18 @@ static int shmem_update_perf(struct amd_cpudata *cpudata, u8 min_perf,
> value |= FIELD_PREP(AMD_CPPC_MIN_PERF_MASK, min_perf);
> value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
>
> + if (trace_amd_pstate_epp_perf_enabled()) {
> + union perf_cached perf = cpudata->perf;
> +
> + trace_amd_pstate_epp_perf(cpudata->cpu,
> + perf.highest_perf,
> + epp,
> + min_perf,
> + max_perf,
> + policy->boost_enabled,
> + value != prev);
> + }
> +
> if (value == prev)
> return 0;
>
> @@ -598,7 +643,7 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u8 min_perf,
> cpudata->cpu, fast_switch);
> }
>
> - amd_pstate_update_perf(cpudata, min_perf, des_perf, max_perf, 0, fast_switch);
> + amd_pstate_update_perf(policy, min_perf, des_perf, max_perf, 0, fast_switch);
> }
>
> static int amd_pstate_verify(struct cpufreq_policy_data *policy_data)
> @@ -1546,7 +1591,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> return ret;
> WRITE_ONCE(cpudata->cppc_req_cached, value);
> }
> - ret = amd_pstate_set_epp(cpudata, cpudata->epp_default);
> + ret = amd_pstate_set_epp(policy, cpudata->epp_default);
> if (ret)
> return ret;
>
> @@ -1588,14 +1633,8 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
> epp = READ_ONCE(cpudata->epp_cached);
>
> perf = READ_ONCE(cpudata->perf);
> - if (trace_amd_pstate_epp_perf_enabled()) {
> - trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf, epp,
> - perf.min_limit_perf,
> - perf.max_limit_perf,
> - policy->boost_enabled);
> - }
>
> - return amd_pstate_update_perf(cpudata, perf.min_limit_perf, 0U,
> + return amd_pstate_update_perf(policy, perf.min_limit_perf, 0U,
> perf.max_limit_perf, epp, false);
> }
>
> @@ -1635,14 +1674,9 @@ static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
> if (ret)
> pr_err("failed to enable amd pstate during resume, return %d\n", ret);
>
> - if (trace_amd_pstate_epp_perf_enabled()) {
> - trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
> - cpudata->epp_cached,
> - FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
> - perf.highest_perf, policy->boost_enabled);
> - }
> + guard(mutex)(&cpudata->lock);
>
> - return amd_pstate_update_perf(cpudata, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
> + return amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
> }
>
> static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
> @@ -1668,14 +1702,9 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
> if (cpudata->suspended)
> return 0;
>
> - if (trace_amd_pstate_epp_perf_enabled()) {
> - trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
> - AMD_CPPC_EPP_BALANCE_POWERSAVE,
> - perf.lowest_perf, perf.lowest_perf,
> - policy->boost_enabled);
> - }
> + guard(mutex)(&cpudata->lock);
>
> - return amd_pstate_update_perf(cpudata, perf.lowest_perf, 0, perf.lowest_perf,
> + return amd_pstate_update_perf(policy, perf.lowest_perf, 0, perf.lowest_perf,
> AMD_CPPC_EPP_BALANCE_POWERSAVE, false);
> }
>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 03/14] cpufreq/amd-pstate: Move perf values into a union
2025-02-12 6:31 ` Dhananjay Ugwekar
@ 2025-02-12 22:03 ` Mario Limonciello
0 siblings, 0 replies; 41+ messages in thread
From: Mario Limonciello @ 2025-02-12 22:03 UTC (permalink / raw)
To: Dhananjay Ugwekar, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/12/2025 00:31, Dhananjay Ugwekar wrote:
> On 2/12/2025 3:44 AM, Mario Limonciello wrote:
>> On 2/10/2025 07:38, Dhananjay Ugwekar wrote:
>>> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>>>> From: Mario Limonciello <mario.limonciello@amd.com>
>>>>
>>>> By storing perf values in a union all the writes and reads can
>>>> be done atomically, removing the need for some concurrency protections.
>>>>
>>>> While making this change, also drop the cached frequency values,
>>>> using inline helpers to calculate them on demand from perf value.
>>>>
>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>>> ---
> [Snip]
>>>> static int amd_pstate_update_freq(struct cpufreq_policy *policy,
>>>> unsigned int target_freq, bool fast_switch)
>>>> {
>>>> struct cpufreq_freqs freqs;
>>>> - struct amd_cpudata *cpudata = policy->driver_data;
>>>> + struct amd_cpudata *cpudata;
>>>> + union perf_cached perf;
>>>> u8 des_perf;
>>>> amd_pstate_update_min_max_limit(policy);
>>>> + cpudata = policy->driver_data;
>>>
>>> Any specific reason why we moved this dereferencing after amd_pstate_update_min_max_limit()?
>>
>> Closer to the first use.
>>
>>>
>>>> + perf = READ_ONCE(cpudata->perf);
>>>> +
>>>> freqs.old = policy->cur;
>>>> freqs.new = target_freq;
>>>> - des_perf = freq_to_perf(cpudata, target_freq);
>>>> + des_perf = freq_to_perf(perf, cpudata->nominal_freq, target_freq);
>>>
>>> Personally I preferred the earlier 2-argument format for the helper functions, as the helper
>>> function handled the common dereferencing part (i.e. cpudata->perf and cpudata->nominal_freq).
>>
>> Something like this?
>>
>> static inline u8 freq_to_perf(struct amd_cpudata *cpudata, unsigned int freq_val)
>> {
>> union perf_cached perf = READ_ONCE(cpudata->perf);
>> u8 perf_val = DIV_ROUND_UP_ULL((u64)freq_val * perf.nominal_perf, cpudata->nominal_freq);
>>
>> return clamp_t(u8, perf_val, perf.lowest_perf, perf.highest_perf);
>> }
>>
>> As an example, in practice what that turns into with inline code is:
>>
>> static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
>> {
>> struct amd_cpudata *cpudata = policy->driver_data;
>> union perf_cached perf = READ_ONCE(cpudata->perf);
>> union perf_cached perf2 = READ_ONCE(cpudata->perf);
>> union perf_cached perf3 = READ_ONCE(cpudata->perf);
>> u8 val1 = DIV_ROUND_UP_ULL((u64)policy->max * perf2.nominal_perf, cpudata->nominal_freq);
>> u8 val2 = DIV_ROUND_UP_ULL((u64)policy->min * perf2.nominal_perf, cpudata->nominal_freq);
>>
>> perf.max_limit_perf = clamp_t(u8, val1, perf2.lowest_perf, perf2.highest_perf);
>> perf.min_limit_perf = clamp_t(u8, val2, perf3.lowest_perf, perf3.highest_perf);
>> .
>> .
>> .
>>
>> So now that's 3 reads for cpudata->perf in every use.
>
> Yeah, right, it's a tradeoff between cleaner-looking code and fewer computations.
> I'll leave it up to you, I'm okay either way.
>
OK - I think I'll leave it as it is now for the next spin, and let
Gautham be the tie-breaker when he reviews it, if he doesn't like it.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 04/14] cpufreq/amd-pstate: Overhaul locking
2025-02-12 5:15 ` Dhananjay Ugwekar
@ 2025-02-12 22:05 ` Mario Limonciello
0 siblings, 0 replies; 41+ messages in thread
From: Mario Limonciello @ 2025-02-12 22:05 UTC (permalink / raw)
To: Dhananjay Ugwekar, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/11/2025 23:15, Dhananjay Ugwekar wrote:
> On 2/12/2025 3:24 AM, Mario Limonciello wrote:
>> On 2/10/2025 23:02, Dhananjay Ugwekar wrote:
>>> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>>>> From: Mario Limonciello <mario.limonciello@amd.com>
>>>>
>>>> amd_pstate_cpu_boost_update() and refresh_frequency_limits() both
>>>> update the policy state and have nothing to do with the amd-pstate
>>>> driver itself.
>>>>
>>>> A global "limits" lock doesn't make sense because each CPU can have
>>>> policies changed independently. Instead introduce locks into to the
>>>> cpudata structure and lock each CPU independently.
>>>>
>>>> The remaining "global" driver lock is used to ensure that only one
>>>> entity can change driver modes at a given time.
>>>>
>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>>> ---
>>>> drivers/cpufreq/amd-pstate.c | 27 +++++++++++++++++----------
>>>> drivers/cpufreq/amd-pstate.h | 2 ++
>>>> 2 files changed, 19 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>>>> index 77bc6418731ee..dd230ed3b9579 100644
>>>> --- a/drivers/cpufreq/amd-pstate.c
>>>> +++ b/drivers/cpufreq/amd-pstate.c
>>>> @@ -196,7 +196,6 @@ static inline int get_mode_idx_from_str(const char *str, size_t size)
>>>> return -EINVAL;
>>>> }
>>>> -static DEFINE_MUTEX(amd_pstate_limits_lock);
>>>> static DEFINE_MUTEX(amd_pstate_driver_lock);
>>>> static u8 msr_get_epp(struct amd_cpudata *cpudata)
>>>> @@ -283,6 +282,8 @@ static int msr_set_epp(struct amd_cpudata *cpudata, u8 epp)
>>>> u64 value, prev;
>>>> int ret;
>>>> + lockdep_assert_held(&cpudata->lock);
>>>
>>> After making the perf_cached variable writes atomic, do we still need a cpudata->lock ?
>>
>> My concern was specifically that userspace could interact with multiple sysfs files that influence the atomic perf variable (and the HW) at the same time. So you would not have a deterministic behavior if they raced. But if you take the mutex on all the paths that this could happen it will be a FIFO.
>
> I guess the lock still won't guarantee the ordering, right? It will just ensure that only one thread executes
> that code path for a specific CPU at a time. And do we even care about the ordering? I'm having a hard
> time thinking of a scenario where we'll need the lock. Can you or Gautham think of any such scenario?
>
You're right; I can't really think of one either. Let me take out the
private lock for the per-cpu device and I'll just overhaul the global
lock locations.
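As a sketch of one possible direction (assuming the sysfs store path simply
serializes on the existing driver-wide mutex instead), the tail of
store_energy_performance_preference() would become:

	guard(mutex)(&amd_pstate_driver_lock);

	ret = amd_pstate_set_energy_pref_index(policy, ret);

	return ret ? ret : count;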
>>
>>>
>>> Regards,
>>> Dhananjay
>>>
>>>> +
>>>> value = prev = READ_ONCE(cpudata->cppc_req_cached);
>>>> value &= ~AMD_CPPC_EPP_PERF_MASK;
>>>> value |= FIELD_PREP(AMD_CPPC_EPP_PERF_MASK, epp);
>>>> @@ -315,6 +316,8 @@ static int shmem_set_epp(struct amd_cpudata *cpudata, u8 epp)
>>>> int ret;
>>>> struct cppc_perf_ctrls perf_ctrls;
>>>> + lockdep_assert_held(&cpudata->lock);
>>>> +
>>>> if (epp == cpudata->epp_cached)
>>>> return 0;
>>>> @@ -335,6 +338,8 @@ static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
>>>> struct amd_cpudata *cpudata = policy->driver_data;
>>>> u8 epp;
>>>> + guard(mutex)(&cpudata->lock);
>>>> +
>>>> if (!pref_index)
>>>> epp = cpudata->epp_default;
>>>> else
>>>> @@ -750,7 +755,6 @@ static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
>>>> pr_err("Boost mode is not supported by this processor or SBIOS\n");
>>>> return -EOPNOTSUPP;
>>>> }
>>>> - guard(mutex)(&amd_pstate_driver_lock);
>>>> ret = amd_pstate_cpu_boost_update(policy, state);
>>>> refresh_frequency_limits(policy);
>>>> @@ -973,6 +977,9 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>>> cpudata->cpu = policy->cpu;
>>>> + mutex_init(&cpudata->lock);
>>>> + guard(mutex)(&cpudata->lock);
>>>> +
>>>> ret = amd_pstate_init_perf(cpudata);
>>>> if (ret)
>>>> goto free_cpudata1;
>>>> @@ -1179,8 +1186,6 @@ static ssize_t store_energy_performance_preference(
>>>> if (ret < 0)
>>>> return -EINVAL;
>>>> - guard(mutex)(&amd_pstate_limits_lock);
>>>> -
>>>> ret = amd_pstate_set_energy_pref_index(policy, ret);
>>>> return ret ? ret : count;
>>>> @@ -1353,8 +1358,10 @@ int amd_pstate_update_status(const char *buf, size_t size)
>>>> if (mode_idx < 0 || mode_idx >= AMD_PSTATE_MAX)
>>>> return -EINVAL;
>>>> - if (mode_state_machine[cppc_state][mode_idx])
>>>> + if (mode_state_machine[cppc_state][mode_idx]) {
>>>> + guard(mutex)(&amd_pstate_driver_lock);
>>>> return mode_state_machine[cppc_state][mode_idx](mode_idx);
>>>> + }
>>>> return 0;
>>>> }
>>>> @@ -1375,7 +1382,6 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
>>>> char *p = memchr(buf, '\n', count);
>>>> int ret;
>>>> - guard(mutex)(&amd_pstate_driver_lock);
>>>> ret = amd_pstate_update_status(buf, p ? p - buf : count);
>>>> return ret < 0 ? ret : count;
>>>> @@ -1472,6 +1478,9 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>>>> cpudata->cpu = policy->cpu;
>>>> + mutex_init(&cpudata->lock);
>>>> + guard(mutex)(&cpudata->lock);
>>>> +
>>>> ret = amd_pstate_init_perf(cpudata);
>>>> if (ret)
>>>> goto free_cpudata1;
>>>> @@ -1558,6 +1567,8 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
>>>> union perf_cached perf;
>>>> u8 epp;
>>>> + guard(mutex)(&cpudata->lock);
>>>> +
>>>> amd_pstate_update_min_max_limit(policy);
>>>> if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>>>> @@ -1646,8 +1657,6 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
>>>> if (cpudata->suspended)
>>>> return 0;
>>>> - guard(mutex)(&amd_pstate_limits_lock);
>>>> -
>>>> if (trace_amd_pstate_epp_perf_enabled()) {
>>>> trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
>>>> AMD_CPPC_EPP_BALANCE_POWERSAVE,
>>>> @@ -1684,8 +1693,6 @@ static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
>>>> struct amd_cpudata *cpudata = policy->driver_data;
>>>> if (cpudata->suspended) {
>>>> - guard(mutex)(&amd_pstate_limits_lock);
>>>> -
>>>> /* enable amd pstate from suspend state*/
>>>> amd_pstate_epp_reenable(policy);
>>>> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
>>>> index a140704b97430..6d776c3e5712a 100644
>>>> --- a/drivers/cpufreq/amd-pstate.h
>>>> +++ b/drivers/cpufreq/amd-pstate.h
>>>> @@ -96,6 +96,8 @@ struct amd_cpudata {
>>>> bool boost_supported;
>>>> bool hw_prefcore;
>>>> + struct mutex lock;
>>>> +
>>>> /* EPP feature related attributes*/
>>>> u8 epp_cached;
>>>> u32 policy;
>>>
>>
>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH 13/14] cpufreq/amd-pstate: Rework CPPC enabling
2025-02-06 21:56 ` [PATCH 13/14] cpufreq/amd-pstate: Rework CPPC enabling Mario Limonciello
@ 2025-02-13 4:42 ` Dhananjay Ugwekar
0 siblings, 0 replies; 41+ messages in thread
From: Dhananjay Ugwekar @ 2025-02-13 4:42 UTC (permalink / raw)
To: Mario Limonciello, Gautham R . Shenoy, Perry Yuan
Cc: open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
open list:CPU FREQUENCY SCALING FRAMEWORK, Mario Limonciello
On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> The CPPC enable register is configured as "write once". That is,
> any future writes don't actually do anything.
>
> Because of this, all the cleanup paths that currently exist for
> CPPC disable are non-effective.
>
> Rework CPPC enable to only enable after all the CAP registers have
> been read to avoid enabling CPPC on CPUs with invalid _CPC or
> unpopulated MSRs.
>
> As the register is write once, remove all cleanup paths as well.
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate.c | 188 +++++++++++------------------------
> 1 file changed, 59 insertions(+), 129 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 5945b6c7f7e56..697fa1b80cf24 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -85,7 +85,6 @@ static struct cpufreq_driver *current_pstate_driver;
> static struct cpufreq_driver amd_pstate_driver;
> static struct cpufreq_driver amd_pstate_epp_driver;
> static int cppc_state = AMD_PSTATE_UNDEFINED;
> -static bool cppc_enabled;
> static bool amd_pstate_prefcore = true;
> static struct quirk_entry *quirks;
>
> @@ -375,91 +374,40 @@ static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
> return ret;
> }
>
> -static int amd_pstate_set_energy_pref_index(struct cpufreq_policy *policy,
> - int pref_index)
> +static inline int msr_cppc_enable(struct cpufreq_policy *policy)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
Can't we just use "policy->cpu" in the return statement and avoid this deref?
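i.e. something like (sketch):

static inline int msr_cppc_enable(struct cpufreq_policy *policy)
{
	return wrmsrl_safe_on_cpu(policy->cpu, MSR_AMD_CPPC_ENABLE, 1);
}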
> - u8 epp;
> -
> - guard(mutex)(&cpudata->lock);
>
> - if (!pref_index)
> - epp = cpudata->epp_default;
> - else
> - epp = epp_values[pref_index];
> -
> - if (epp > 0 && policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
> - pr_debug("EPP cannot be set under performance policy\n");
> - return -EBUSY;
> - }
> -
> - return amd_pstate_set_epp(policy, epp);
> + return wrmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_ENABLE, 1);
> }
>
> -static inline int msr_cppc_enable(bool enable)
> +static int shmem_cppc_enable(struct cpufreq_policy *policy)
> {
> - int ret, cpu;
> - unsigned long logical_proc_id_mask = 0;
> -
> - /*
> - * MSR_AMD_CPPC_ENABLE is write-once, once set it cannot be cleared.
> - */
> - if (!enable)
> - return 0;
> -
> - if (enable == cppc_enabled)
> - return 0;
> -
> - for_each_present_cpu(cpu) {
> - unsigned long logical_id = topology_logical_package_id(cpu);
> -
> - if (test_bit(logical_id, &logical_proc_id_mask))
> - continue;
> -
> - set_bit(logical_id, &logical_proc_id_mask);
> -
> - ret = wrmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_ENABLE,
> - enable);
> - if (ret)
> - return ret;
> - }
> -
> - cppc_enabled = enable;
> - return 0;
> -}
> -
> -static int shmem_cppc_enable(bool enable)
> -{
> - int cpu, ret = 0;
> + struct amd_cpudata *cpudata = policy->driver_data;
Similarly, we can skip this deref if we use "policy->cpu" in this function.
> struct cppc_perf_ctrls perf_ctrls;
> + int ret;
>
> - if (enable == cppc_enabled)
> - return 0;
> + ret = cppc_set_enable(cpudata->cpu, 1);
> + if (ret)
> + return ret;
>
> - for_each_present_cpu(cpu) {
> - ret = cppc_set_enable(cpu, enable);
> + /* Enable autonomous mode for EPP */
> + if (cppc_state == AMD_PSTATE_ACTIVE) {
> + /* Set desired perf as zero to allow EPP firmware control */
> + perf_ctrls.desired_perf = 0;
> + ret = cppc_set_perf(cpudata->cpu, &perf_ctrls);
> if (ret)
> return ret;
> -
> - /* Enable autonomous mode for EPP */
> - if (cppc_state == AMD_PSTATE_ACTIVE) {
> - /* Set desired perf as zero to allow EPP firmware control */
> - perf_ctrls.desired_perf = 0;
> - ret = cppc_set_perf(cpu, &perf_ctrls);
> - if (ret)
> - return ret;
> - }
> }
>
> - cppc_enabled = enable;
> return ret;
> }
>
> DEFINE_STATIC_CALL(amd_pstate_cppc_enable, msr_cppc_enable);
>
> -static inline int amd_pstate_cppc_enable(bool enable)
> +static inline int amd_pstate_cppc_enable(struct cpufreq_policy *policy)
> {
> - return static_call(amd_pstate_cppc_enable)(enable);
> + return static_call(amd_pstate_cppc_enable)(policy);
> }
>
> static int msr_init_perf(struct amd_cpudata *cpudata)
> @@ -1122,24 +1070,7 @@ static void amd_pstate_cpu_exit(struct cpufreq_policy *policy)
>
> static int amd_pstate_cpu_resume(struct cpufreq_policy *policy)
> {
> - int ret;
> -
> - ret = amd_pstate_cppc_enable(true);
> - if (ret)
> - pr_err("failed to enable amd-pstate during resume, return %d\n", ret);
> -
> - return ret;
> -}
> -
> -static int amd_pstate_cpu_suspend(struct cpufreq_policy *policy)
> -{
> - int ret;
> -
> - ret = amd_pstate_cppc_enable(false);
> - if (ret)
> - pr_err("failed to disable amd-pstate during suspend, return %d\n", ret);
> -
> - return ret;
> + return amd_pstate_cppc_enable(policy);
We can get rid of this function, right? We will always enable CPPC in the init path
and never disable it after that.
> }
>
> /* Sysfs attributes */
> @@ -1241,8 +1172,10 @@ static ssize_t show_energy_performance_available_preferences(
> static ssize_t store_energy_performance_preference(
> struct cpufreq_policy *policy, const char *buf, size_t count)
> {
> + struct amd_cpudata *cpudata = policy->driver_data;
> char str_preference[21];
> ssize_t ret;
> + u8 epp;
>
> ret = sscanf(buf, "%20s", str_preference);
> if (ret != 1)
> @@ -1252,7 +1185,31 @@ static ssize_t store_energy_performance_preference(
> if (ret < 0)
> return -EINVAL;
>
> - ret = amd_pstate_set_energy_pref_index(policy, ret);
> + if (!ret)
> + epp = cpudata->epp_default;
> + else
> + epp = epp_values[ret];
> +
> + if (epp > 0 && policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
> + pr_debug("EPP cannot be set under performance policy\n");
> + return -EBUSY;
> + }
> +
> + if (trace_amd_pstate_epp_perf_enabled()) {
> + union perf_cached perf = cpudata->perf;
> +
> + trace_amd_pstate_epp_perf(cpudata->cpu, perf.highest_perf,
> + epp,
> + FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
> + FIELD_GET(AMD_CPPC_MAX_PERF_MASK, cpudata->cppc_req_cached),
> + policy->boost_enabled,
> + FIELD_GET(AMD_CPPC_EPP_PERF_MASK,
> + cpudata->cppc_req_cached) != epp);
We've moved the tracing part into the set_epp and update_perf functions, right?
Do we need it here? I see a set_epp() call just below.
> + }
> +
> + guard(mutex)(&cpudata->lock);
> +
> + ret = amd_pstate_set_epp(policy, epp);
>
> return ret ? ret : count;
> }
> @@ -1285,7 +1242,6 @@ static ssize_t show_energy_performance_preference(
>
> static void amd_pstate_driver_cleanup(void)
> {
> - amd_pstate_cppc_enable(false);
> cppc_state = AMD_PSTATE_DISABLE;
> current_pstate_driver = NULL;
> }
> @@ -1319,14 +1275,6 @@ static int amd_pstate_register_driver(int mode)
>
> cppc_state = mode;
>
> - ret = amd_pstate_cppc_enable(true);
> - if (ret) {
> - pr_err("failed to enable cppc during amd-pstate driver registration, return %d\n",
> - ret);
> - amd_pstate_driver_cleanup();
> - return ret;
> - }
> -
> /* at least one CPU supports CPB */
> current_pstate_driver->boost_enabled = cpu_feature_enabled(X86_FEATURE_CPB);
>
> @@ -1570,11 +1518,15 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
> policy->cpuinfo.max_freq = policy->max = perf_to_freq(perf,
> cpudata->nominal_freq,
> perf.highest_perf);
> + policy->driver_data = cpudata;
> +
> + ret = amd_pstate_cppc_enable(policy);
This will get called for each CPU even though it is a system-wide register :(
Is it possible to add a basic sanity check for "invalid _CPC or unpopulated MSRs" in amd_pstate_init() (see the sketch after this list)?
Because I think the current design is quite good, i.e. two paths which enable CPPC:
1. amd_pstate_init()->amd_pstate_register_driver()->amd_pstate_cppc_enable() [the normal case, where the kernel is booted with "amd_pstate=<x>"]
2. mode_state_machine[disabled][active/guided/passive]()->amd_pstate_register_driver()->amd_pstate_cppc_enable() [the kernel is booted with
acpi_cpufreq, then someone enables amd_pstate by writing to the "amd_pstate/status" file]
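For the sanity check, a minimal sketch of what it could look like on the MSR
path (the helper name is hypothetical):

static int amd_pstate_check_caps(unsigned int cpu)
{
	u64 cap1;
	int ret;

	ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &cap1);
	if (ret)
		return ret;

	/* a CAP1 that reads back as zero was never populated */
	if (!cap1)
		return -ENODEV;

	return 0;
}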
> + if (ret)
> + goto free_cpudata1;
>
> /* It will be updated by governor */
> policy->cur = policy->cpuinfo.min_freq;
>
> - policy->driver_data = cpudata;
>
> policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
>
> @@ -1667,34 +1619,28 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
> return 0;
> }
>
> -static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
> +static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> union perf_cached perf = cpudata->perf;
> int ret;
>
> - ret = amd_pstate_cppc_enable(true);
> + pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
> +
> + ret = amd_pstate_cppc_enable(policy);
> if (ret)
> - pr_err("failed to enable amd pstate during resume, return %d\n", ret);
> + return ret;
>
> guard(mutex)(&cpudata->lock);
>
> - return amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
> -}
> -
> -static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
> -{
> - struct amd_cpudata *cpudata = policy->driver_data;
> - int ret;
> -
> - pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
> -
> - ret = amd_pstate_epp_reenable(policy);
> + ret = amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
> if (ret)
> return ret;
> +
> cpudata->suspended = false;
>
> return 0;
> +
> }
>
> static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
> @@ -1714,20 +1660,10 @@ static int amd_pstate_epp_cpu_offline(struct cpufreq_policy *policy)
> static int amd_pstate_epp_suspend(struct cpufreq_policy *policy)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
> - int ret;
> -
> - /* avoid suspending when EPP is not enabled */
> - if (cppc_state != AMD_PSTATE_ACTIVE)
> - return 0;
>
> /* set this flag to avoid setting core offline*/
> cpudata->suspended = true;
>
> - /* disable CPPC in lowlevel firmware */
> - ret = amd_pstate_cppc_enable(false);
> - if (ret)
> - pr_err("failed to suspend, return %d\n", ret);
> -
> return 0;
> }
>
> @@ -1735,12 +1671,8 @@ static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
> {
> struct amd_cpudata *cpudata = policy->driver_data;
>
> - if (cpudata->suspended) {
> - /* enable amd pstate from suspend state*/
> - amd_pstate_epp_reenable(policy);
>
> - cpudata->suspended = false;
> - }
> + cpudata->suspended = false;
>
> return 0;
> }
> @@ -1752,7 +1684,6 @@ static struct cpufreq_driver amd_pstate_driver = {
> .fast_switch = amd_pstate_fast_switch,
> .init = amd_pstate_cpu_init,
> .exit = amd_pstate_cpu_exit,
> - .suspend = amd_pstate_cpu_suspend,
> .resume = amd_pstate_cpu_resume,
> .set_boost = amd_pstate_set_boost,
> .update_limits = amd_pstate_update_limits,
> @@ -1768,8 +1699,8 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
> .exit = amd_pstate_epp_cpu_exit,
> .offline = amd_pstate_epp_cpu_offline,
> .online = amd_pstate_epp_cpu_online,
> - .suspend = amd_pstate_epp_suspend,
> - .resume = amd_pstate_epp_resume,
> + .suspend = amd_pstate_epp_suspend,
> + .resume = amd_pstate_epp_resume,
Some spurious whitespace change?
> .update_limits = amd_pstate_update_limits,
> .set_boost = amd_pstate_set_boost,
> .name = "amd-pstate-epp",
> @@ -1920,7 +1851,6 @@ static int __init amd_pstate_init(void)
>
> global_attr_free:
> cpufreq_unregister_driver(current_pstate_driver);
> - amd_pstate_cppc_enable(false);
> return ret;
> }
> device_initcall(amd_pstate_init);
^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2025-02-13 4:42 UTC | newest]
Thread overview: 41+ messages
2025-02-06 21:56 [PATCH 00/14] amd-pstate cleanups Mario Limonciello
2025-02-06 21:56 ` [PATCH 01/14] cpufreq/amd-pstate: Show a warning when a CPU fails to setup Mario Limonciello
2025-02-10 11:59 ` Dhananjay Ugwekar
2025-02-10 13:50 ` Gautham R. Shenoy
2025-02-10 15:13 ` Mario Limonciello
2025-02-06 21:56 ` [PATCH 02/14] cpufreq/amd-pstate: Drop min and max cached frequencies Mario Limonciello
2025-02-07 10:44 ` Dhananjay Ugwekar
2025-02-07 16:15 ` Mario Limonciello
2025-02-06 21:56 ` [PATCH 03/14] cpufreq/amd-pstate: Move perf values into a union Mario Limonciello
2025-02-10 13:38 ` Dhananjay Ugwekar
2025-02-11 22:14 ` Mario Limonciello
2025-02-12 6:31 ` Dhananjay Ugwekar
2025-02-12 22:03 ` Mario Limonciello
2025-02-06 21:56 ` [PATCH 04/14] cpufreq/amd-pstate: Overhaul locking Mario Limonciello
2025-02-11 5:02 ` Dhananjay Ugwekar
2025-02-11 21:54 ` Mario Limonciello
2025-02-12 5:15 ` Dhananjay Ugwekar
2025-02-12 22:05 ` Mario Limonciello
2025-02-06 21:56 ` [PATCH 05/14] cpufreq/amd-pstate: Drop `cppc_cap1_cached` Mario Limonciello
2025-02-11 5:46 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 06/14] cpufreq/amd-pstate-ut: Use _free macro to free put policy Mario Limonciello
2025-02-11 5:58 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 07/14] cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks Mario Limonciello
2025-02-11 6:16 ` Dhananjay Ugwekar
2025-02-11 18:31 ` Mario Limonciello
2025-02-06 21:56 ` [PATCH 08/14] cpufreq/amd-pstate: Cache CPPC request in shared mem case too Mario Limonciello
2025-02-11 9:18 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 09/14] cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and *_set_epp functions Mario Limonciello
2025-02-12 6:39 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 10/14] cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes Mario Limonciello
2025-02-11 13:01 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 11/14] cpufreq/amd-pstate: Drop debug statements for policy setting Mario Limonciello
2025-02-11 13:03 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 12/14] cpufreq/amd-pstate: Cache a pointer to policy in cpudata Mario Limonciello
2025-02-11 13:13 ` Dhananjay Ugwekar
2025-02-11 19:17 ` Mario Limonciello
2025-02-12 3:52 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 13/14] cpufreq/amd-pstate: Rework CPPC enabling Mario Limonciello
2025-02-13 4:42 ` Dhananjay Ugwekar
2025-02-06 21:56 ` [PATCH 14/14] cpufreq/amd-pstate: Stop caching EPP Mario Limonciello
2025-02-11 13:27 ` Dhananjay Ugwekar