* [PATCH V4 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
2023-08-29 6:43 [PATCH V4 0/7] AMD Pstate Preferred Core Meng Li
@ 2023-08-29 6:43 ` Meng Li
2023-08-29 6:43 ` [PATCH V4 2/7] acpi: cppc: Add get the highest performance cppc control Meng Li
` (5 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Meng Li @ 2023-08-29 6:43 UTC (permalink / raw)
To: Rafael J . Wysocki, Huang Rui
Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
Viresh Kumar, Borislav Petkov, Meng Li
AMD Pstate driver also uses SCHED_MC_PRIO, so decouple the requirement
of CPU_SUP_INTEL from the dependencies to allow compilation in kernels
without Intel CPU support.
Signed-off-by: Meng Li <li.meng@amd.com>
---
arch/x86/Kconfig | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e36261b4ea14..16df141bd8a2 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1052,8 +1052,9 @@ config SCHED_MC
config SCHED_MC_PRIO
bool "CPU core priorities scheduler support"
- depends on SCHED_MC && CPU_SUP_INTEL
- select X86_INTEL_PSTATE
+ depends on SCHED_MC
+ select X86_INTEL_PSTATE if CPU_SUP_INTEL
+ select X86_AMD_PSTATE if CPU_SUP_AMD
select CPU_FREQ
default y
help
--
2.34.1
* [PATCH V4 2/7] acpi: cppc: Add get the highest performance cppc control
2023-08-29 6:43 [PATCH V4 0/7] AMD Pstate Preferred Core Meng Li
2023-08-29 6:43 ` [PATCH V4 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion Meng Li
@ 2023-08-29 6:43 ` Meng Li
2023-08-29 6:43 ` [PATCH V4 3/7] cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting Meng Li
` (4 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Meng Li @ 2023-08-29 6:43 UTC (permalink / raw)
To: Rafael J . Wysocki, Huang Rui
Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
Viresh Kumar, Borislav Petkov, Meng Li, Wyes Karny
Add support for getting the highest performance value to the
generic CPPC driver. This enables downstream drivers
such as amd-pstate to discover and use this value.
Please refer to the ACPI specification for details on the
continuous performance control (CPPC) registers.
Signed-off-by: Meng Li <li.meng@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
Link: https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html?highlight=cppc#cpc-continuous-performance-control
---
drivers/acpi/cppc_acpi.c | 13 +++++++++++++
include/acpi/cppc_acpi.h | 5 +++++
2 files changed, 18 insertions(+)
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index 7ff269a78c20..ad388a0e8484 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -1154,6 +1154,19 @@ int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf)
return cppc_get_perf(cpunum, NOMINAL_PERF, nominal_perf);
}
+/**
+ * cppc_get_highest_perf - Get the highest performance register value.
+ * @cpunum: CPU from which to get highest performance.
+ * @highest_perf: Return address.
+ *
+ * Return: 0 for success, -EIO otherwise.
+ */
+int cppc_get_highest_perf(int cpunum, u64 *highest_perf)
+{
+ return cppc_get_perf(cpunum, HIGHEST_PERF, highest_perf);
+}
+EXPORT_SYMBOL_GPL(cppc_get_highest_perf);
+
/**
* cppc_get_epp_perf - Get the epp register value.
* @cpunum: CPU from which to get epp preference value.
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index 6126c977ece0..c0b69ffe7bdb 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -139,6 +139,7 @@ struct cppc_cpudata {
#ifdef CONFIG_ACPI_CPPC_LIB
extern int cppc_get_desired_perf(int cpunum, u64 *desired_perf);
extern int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf);
+extern int cppc_get_highest_perf(int cpunum, u64 *highest_perf);
extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
extern int cppc_set_enable(int cpu, bool enable);
@@ -165,6 +166,10 @@ static inline int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf)
{
return -ENOTSUPP;
}
+static inline int cppc_get_highest_perf(int cpunum, u64 *highest_perf)
+{
+ return -ENOTSUPP;
+}
static inline int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs)
{
return -ENOTSUPP;
--
2.34.1
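For readers following the header change, a minimal user-space sketch of how a caller might cope with the -ENOTSUPP stub that is compiled in when CONFIG_ACPI_CPPC_LIB is off. This is not part of the patch: mock_cppc_get_highest_perf(), pick_highest_perf() and the fallback argument are hypothetical names, and ENOTSUPP is defined locally because it is a kernel-internal errno value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Kernel-internal errno value; user-space headers do not provide it. */
#define ENOTSUPP 524

/*
 * Hypothetical stand-in for cppc_get_highest_perf(). With the CPPC library
 * "disabled" it behaves like the inline stub and returns -ENOTSUPP without
 * touching the output; otherwise it hands back the given register value.
 */
static int mock_cppc_get_highest_perf(int cppc_lib_enabled, uint64_t reg_value,
				      uint64_t *highest_perf)
{
	assert(highest_perf != NULL);

	if (!cppc_lib_enabled)
		return -ENOTSUPP;

	*highest_perf = reg_value;
	return 0;
}

/*
 * Hypothetical caller: use the reported value when the read succeeds,
 * otherwise use a caller-chosen fallback.
 */
static uint64_t pick_highest_perf(int cppc_lib_enabled, uint64_t reg_value,
				  uint64_t fallback)
{
	uint64_t perf;

	if (mock_cppc_get_highest_perf(cppc_lib_enabled, reg_value, &perf))
		return fallback;

	return perf;
}
```

The point of the stub is that callers compile unchanged in both configurations and only need to check the return value.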
* [PATCH V4 3/7] cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting.
2023-08-29 6:43 [PATCH V4 0/7] AMD Pstate Preferred Core Meng Li
2023-08-29 6:43 ` [PATCH V4 1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion Meng Li
2023-08-29 6:43 ` [PATCH V4 2/7] acpi: cppc: Add get the highest performance cppc control Meng Li
@ 2023-08-29 6:43 ` Meng Li
2023-08-29 8:01 ` Huang Rui
2023-08-29 14:52 ` kernel test robot
2023-08-29 6:43 ` [PATCH V4 4/7] cpufreq: Add a notification message that the highest perf has changed Meng Li
` (3 subsequent siblings)
6 siblings, 2 replies; 13+ messages in thread
From: Meng Li @ 2023-08-29 6:43 UTC (permalink / raw)
To: Rafael J . Wysocki, Huang Rui
Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
Viresh Kumar, Borislav Petkov, Meng Li
The AMD Pstate driver uses the functions and data structures
provided by the ITMT architecture to let the scheduler favor
cores that can reach a higher frequency at a lower voltage.
We call this AMD Pstate Preferred Core.
Here sched_set_itmt_core_prio() is called to set priorities and
sched_set_itmt_support() is called to enable the ITMT feature.
The AMD Pstate driver uses the highest performance value as the
priority of a CPU: a higher value means a higher priority.
The initial core rankings are set up by AMD Pstate when the
system boots.
Add a device attribute that reports the preferred core state.
Add a new early parameter, `amd_prefcore`, so that the user can
pass `disable` to turn preferred core off; it is on by default
when the processor and power firmware support the feature.
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
Signed-off-by: Meng Li <li.meng@amd.com>
Co-developed-by: Meng Li <li.meng@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/cpufreq/amd-pstate.c | 120 ++++++++++++++++++++++++++++++-----
1 file changed, 104 insertions(+), 16 deletions(-)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 9a1e194d5cf8..d02305675f66 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -37,6 +37,7 @@
#include <linux/uaccess.h>
#include <linux/static_call.h>
#include <linux/amd-pstate.h>
+#include <linux/topology.h>
#include <acpi/processor.h>
#include <acpi/cppc_acpi.h>
@@ -49,6 +50,8 @@
#define AMD_PSTATE_TRANSITION_LATENCY 20000
#define AMD_PSTATE_TRANSITION_DELAY 1000
+#define AMD_PSTATE_PREFCORE_THRESHOLD 166
+#define AMD_PSTATE_MAX_CPPC_PERF 255
/*
* TODO: We need more time to fine tune processors with shared memory solution
@@ -65,6 +68,9 @@ static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_UNDEFINED;
static bool cppc_enabled;
+/* Preferred Core feature is supported */
+static bool prefcore = true;
+
/*
* AMD Energy Preference Performance (EPP)
* The EPP is used in the CCLK DPM controller to drive
@@ -290,23 +296,21 @@ static inline int amd_pstate_enable(bool enable)
static int pstate_init_perf(struct amd_cpudata *cpudata)
{
u64 cap1;
- u32 highest_perf;
int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
&cap1);
if (ret)
return ret;
- /*
- * TODO: Introduce AMD specific power feature.
- *
- * CPPC entry doesn't indicate the highest performance in some ASICs.
+ /* For platforms that do not support the preferred core feature, the
+ * highest_perf may be configured with 166 or 255. To avoid calculating
+ * the max frequency wrongly, we take the AMD_CPPC_HIGHEST_PERF(cap1)
+ * value as the default max perf.
*/
- highest_perf = amd_get_highest_perf();
- if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
- highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
-
- WRITE_ONCE(cpudata->highest_perf, highest_perf);
+ if (prefcore)
+ WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
+ else
+ WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
@@ -318,17 +322,15 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
static int cppc_init_perf(struct amd_cpudata *cpudata)
{
struct cppc_perf_caps cppc_perf;
- u32 highest_perf;
int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;
- highest_perf = amd_get_highest_perf();
- if (highest_perf > cppc_perf.highest_perf)
- highest_perf = cppc_perf.highest_perf;
-
- WRITE_ONCE(cpudata->highest_perf, highest_perf);
+ if (prefcore)
+ WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
+ else
+ WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);
WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
WRITE_ONCE(cpudata->lowest_nonlinear_perf,
@@ -676,6 +678,72 @@ static void amd_perf_ctl_reset(unsigned int cpu)
wrmsrl_on_cpu(cpu, MSR_AMD_PERF_CTL, 0);
}
+/*
+ * Enabling AMD Pstate Preferred Core can't be done directly from cpufreq
+ * callbacks due to locking, so queue the work for later.
+ */
+static void amd_pstate_sched_prefcore_workfn(struct work_struct *work)
+{
+ sched_set_itmt_support();
+}
+static DECLARE_WORK(sched_prefcore_work, amd_pstate_sched_prefcore_workfn);
+
+/**
+ * amd_pstate_get_highest_perf - Get the highest performance register value.
+ * @cpu: CPU from which to get highest performance.
+ * @highest_perf: Return address.
+ *
+ * Return: 0 for success, -EIO otherwise.
+ */
+static int amd_pstate_get_highest_perf(int cpu, u64 *highest_perf)
+{
+ int ret;
+
+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
+ u64 cap1;
+
+ ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &cap1);
+ if (ret)
+ return ret;
+ WRITE_ONCE(*highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
+ } else {
+ ret = cppc_get_highest_perf(cpu, highest_perf);
+ }
+
+ return ret;
+}
+
+static void amd_pstate_init_prefcore(void)
+{
+ int cpu, ret;
+ u64 highest_perf;
+
+ if (!prefcore)
+ return;
+
+ for_each_online_cpu(cpu) {
+ ret = amd_pstate_get_highest_perf(cpu, &highest_perf);
+ if (ret)
+ break;
+
+ sched_set_itmt_core_prio(highest_perf, cpu);
+
+ /* Check if CPPC preferred core feature is enabled */
+ if (highest_perf == AMD_PSTATE_MAX_CPPC_PERF) {
+ prefcore = false;
+ return;
+ }
+ }
+
+ /*
+ * This code can be run during CPU online under the
+ * CPU hotplug locks, so sched_set_itmt_support()
+ * cannot be called from here. Queue up a work item
+ * to invoke it.
+ */
+ schedule_work(&sched_prefcore_work);
+}
+
static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
{
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -1037,6 +1105,12 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
return ret < 0 ? ret : count;
}
+static ssize_t prefcore_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return sysfs_emit(buf, "%s\n", prefcore ? "enabled" : "disabled");
+}
+
cpufreq_freq_attr_ro(amd_pstate_max_freq);
cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
@@ -1044,6 +1118,7 @@ cpufreq_freq_attr_ro(amd_pstate_highest_perf);
cpufreq_freq_attr_rw(energy_performance_preference);
cpufreq_freq_attr_ro(energy_performance_available_preferences);
static DEVICE_ATTR_RW(status);
+static DEVICE_ATTR_RO(prefcore);
static struct freq_attr *amd_pstate_attr[] = {
&amd_pstate_max_freq,
@@ -1063,6 +1138,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
static struct attribute *pstate_global_attributes[] = {
&dev_attr_status.attr,
+ &dev_attr_prefcore.attr,
NULL
};
@@ -1506,6 +1582,8 @@ static int __init amd_pstate_init(void)
}
}
+ amd_pstate_init_prefcore();
+
return ret;
global_attr_free:
@@ -1527,7 +1605,17 @@ static int __init amd_pstate_param(char *str)
return amd_pstate_set_driver(mode_idx);
}
+
+static int __init amd_prefcore_param(char *str)
+{
+ if (!strcmp(str, "disable"))
+ prefcore = false;
+
+ return 0;
+}
+
early_param("amd_pstate", amd_pstate_param);
+early_param("amd_prefcore", amd_prefcore_param);
MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");
--
2.34.1
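To make the ranking logic in amd_pstate_init_prefcore() easier to follow, here is a small user-space sketch of the same loop. It is an illustration only, under two assumptions: that the highest-performance field sits in bits 31:24 of the capability word (which is how AMD_CPPC_HIGHEST_PERF() is understood to decode it), and that a value of 255 on any CPU means preferred core is not in use. The names highest_perf_from_cap1() and rank_cores() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors AMD_PSTATE_MAX_CPPC_PERF from the patch. */
#define PREFCORE_MAX_CPPC_PERF 255

/* Assumed layout: highest perf lives in bits 31:24 of the capability word. */
static unsigned int highest_perf_from_cap1(uint64_t cap1)
{
	return (unsigned int)((cap1 >> 24) & 0xff);
}

/*
 * Hypothetical mirror of the loop in amd_pstate_init_prefcore(): record each
 * CPU's highest perf as its priority, and stop with 0 ("feature off") as soon
 * as one CPU reports 255. Returns 1 when every CPU was ranked.
 */
static int rank_cores(const uint64_t *cap1, size_t ncpus, unsigned int *prio)
{
	size_t cpu;

	assert(cap1 != NULL && prio != NULL);

	for (cpu = 0; cpu < ncpus; cpu++) {
		prio[cpu] = highest_perf_from_cap1(cap1[cpu]);
		if (prio[cpu] == PREFCORE_MAX_CPPC_PERF)
			return 0;
	}

	return 1;
}
```

As in the patch, the priority is written before the 255 check, so the CPU that triggers the early exit still has a priority recorded.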
* Re: [PATCH V4 3/7] cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting.
2023-08-29 6:43 ` [PATCH V4 3/7] cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting Meng Li
@ 2023-08-29 8:01 ` Huang Rui
2023-08-31 1:42 ` Meng, Li (Jassmine)
2023-08-29 14:52 ` kernel test robot
1 sibling, 1 reply; 13+ messages in thread
From: Huang Rui @ 2023-08-29 8:01 UTC (permalink / raw)
To: Meng, Li (Jassmine)
Cc: Rafael J . Wysocki, linux-pm@vger.kernel.org,
linux-kernel@vger.kernel.org, x86@kernel.org,
linux-acpi@vger.kernel.org, Shuah Khan,
linux-kselftest@vger.kernel.org, Fontenot, Nathan, Sharma, Deepak,
Deucher, Alexander, Limonciello, Mario, Huang, Shimmer,
Yuan, Perry, Du, Xiaojian, Viresh Kumar, Borislav Petkov
On Tue, Aug 29, 2023 at 02:43:36PM +0800, Meng, Li (Jassmine) wrote:
> AMD Pstate driver utilizes the functions and data structures
> provided by the ITMT architecture to enable the scheduler to
> favor scheduling on cores which can be get a higher frequency
> with lower voltage. We call it AMD Pstate Preferrred Core.
>
> Here sched_set_itmt_core_prio() is called to set priorities and
> sched_set_itmt_support() is called to enable ITMT feature.
> AMD Pstate driver uses the highest performance value to indicate
> the priority of CPU. The higher value has a higher priority.
>
> The initial core rankings are set up by AMD Pstate when the
> system boots.
>
> Add device attribute for preferred core states.
>
> Add one new early parameter `enable` to allow user to
> enable the preferred core if the processor and power
> firmware can support preferred core feature.
>
> Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
> Signed-off-by: Meng Li <li.meng@amd.com>
> Co-developed-by: Meng Li <li.meng@amd.com>
> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/cpufreq/amd-pstate.c | 120 ++++++++++++++++++++++++++++++-----
> 1 file changed, 104 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 9a1e194d5cf8..d02305675f66 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -37,6 +37,7 @@
> #include <linux/uaccess.h>
> #include <linux/static_call.h>
> #include <linux/amd-pstate.h>
> +#include <linux/topology.h>
>
> #include <acpi/processor.h>
> #include <acpi/cppc_acpi.h>
> @@ -49,6 +50,8 @@
>
> #define AMD_PSTATE_TRANSITION_LATENCY 20000
> #define AMD_PSTATE_TRANSITION_DELAY 1000
> +#define AMD_PSTATE_PREFCORE_THRESHOLD 166
> +#define AMD_PSTATE_MAX_CPPC_PERF 255
>
> /*
> * TODO: We need more time to fine tune processors with shared memory solution
> @@ -65,6 +68,9 @@ static struct cpufreq_driver amd_pstate_epp_driver;
> static int cppc_state = AMD_PSTATE_UNDEFINED;
> static bool cppc_enabled;
>
> +/*Preferred Core featue is supported*/
> +static bool prefcore = true;
> +
> /*
> * AMD Energy Preference Performance (EPP)
> * The EPP is used in the CCLK DPM controller to drive
> @@ -290,23 +296,21 @@ static inline int amd_pstate_enable(bool enable)
> static int pstate_init_perf(struct amd_cpudata *cpudata)
> {
> u64 cap1;
> - u32 highest_perf;
>
> int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
> &cap1);
> if (ret)
> return ret;
>
> - /*
> - * TODO: Introduce AMD specific power feature.
> - *
> - * CPPC entry doesn't indicate the highest performance in some ASICs.
> + /* For platforms that do not support the preferred core feature, the
> + * highest_pef may be configured with 166 or 255, to avoid max frequency
> + * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
> + * the default max perf.
> */
> - highest_perf = amd_get_highest_perf();
> - if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
> - highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
> -
> - WRITE_ONCE(cpudata->highest_perf, highest_perf);
> + if (prefcore)
> + WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
> + else
> + WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
>
> WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
> WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
> @@ -318,17 +322,15 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
> static int cppc_init_perf(struct amd_cpudata *cpudata)
> {
> struct cppc_perf_caps cppc_perf;
> - u32 highest_perf;
>
> int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> if (ret)
> return ret;
>
> - highest_perf = amd_get_highest_perf();
> - if (highest_perf > cppc_perf.highest_perf)
> - highest_perf = cppc_perf.highest_perf;
> -
> - WRITE_ONCE(cpudata->highest_perf, highest_perf);
> + if (prefcore)
> + WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
> + else
> + WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);
>
> WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
> WRITE_ONCE(cpudata->lowest_nonlinear_perf,
> @@ -676,6 +678,72 @@ static void amd_perf_ctl_reset(unsigned int cpu)
> wrmsrl_on_cpu(cpu, MSR_AMD_PERF_CTL, 0);
> }
>
> +/*
> + * Set AMD Pstate Preferred Core enable can't be done directly from cpufreq callbacks
> + * due to locking, so queue the work for later.
> + */
> +static void amd_pstste_sched_prefcore_workfn(struct work_struct *work)
> +{
> + sched_set_itmt_support();
> +}
> +static DECLARE_WORK(sched_prefcore_work, amd_pstste_sched_prefcore_workfn);
> +
> +/**
> + * Get the highest performance register value.
> + * @cpu: CPU from which to get highest performance.
> + * @highest_perf: Return address.
> + *
> + * Return: 0 for success, -EIO otherwise.
> + */
> +static int amd_pstate_get_highest_perf(int cpu, u64 *highest_perf)
> +{
> + int ret;
> +
> + if (boot_cpu_has(X86_FEATURE_CPPC)) {
> + u64 cap1;
> +
> + ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &cap1);
> + if (ret)
> + return ret;
> + WRITE_ONCE(*highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
> + } else {
> + ret = cppc_get_highest_perf(cpu, highest_perf);
> + }
> +
> + return (ret);
> +}
> +
> +static void amd_pstate_init_prefcore(void)
> +{
> + int cpu, ret;
> + u64 highest_perf;
> +
> + if (!prefcore)
> + return;
> +
> + for_each_online_cpu(cpu) {
> + ret = amd_pstate_get_highest_perf(cpu, &highest_perf);
> + if (ret)
> + break;
> +
> + sched_set_itmt_core_prio(highest_perf, cpu);
> +
> + /* check if CPPC preferred core feature is enabled*/
> + if (highest_perf == AMD_PSTATE_MAX_CPPC_PERF) {
> + prefcore = false;
> + return;
> + }
> + }
> +
> + /*
> + * This code can be run during CPU online under the
> + * CPU hotplug locks, so sched_set_amd_prefcore_support()
> + * cannot be called from here. Queue up a work item
> + * to invoke it.
> + */
> + schedule_work(&sched_prefcore_work);
> +}
> +
> static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> {
> int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> @@ -1037,6 +1105,12 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
> return ret < 0 ? ret : count;
> }
>
> +static ssize_t prefcore_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + return sysfs_emit(buf, "%s\n", prefcore ? "enabled" : "disabled");
> +}
> +
> cpufreq_freq_attr_ro(amd_pstate_max_freq);
> cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>
> @@ -1044,6 +1118,7 @@ cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> cpufreq_freq_attr_rw(energy_performance_preference);
> cpufreq_freq_attr_ro(energy_performance_available_preferences);
> static DEVICE_ATTR_RW(status);
> +static DEVICE_ATTR_RO(prefcore);
>
> static struct freq_attr *amd_pstate_attr[] = {
> &amd_pstate_max_freq,
> @@ -1063,6 +1138,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
>
> static struct attribute *pstate_global_attributes[] = {
> &dev_attr_status.attr,
> + &dev_attr_prefcore.attr,
> NULL
> };
>
> @@ -1506,6 +1582,8 @@ static int __init amd_pstate_init(void)
> }
> }
>
> + amd_pstate_init_prefcore();
> +
> return ret;
>
> global_attr_free:
> @@ -1527,7 +1605,17 @@ static int __init amd_pstate_param(char *str)
>
> return amd_pstate_set_driver(mode_idx);
> }
> +
> +static int __init amd_prefcore_param(char *str)
> +{
> + if (!strcmp(str, "disable"))
> + prefcore = false;
Preferred core is a hardware capability, so we should have a way to
detect whether the current processor supports it. For example, check
whether the highest_perf value we read is AMD_PSTATE_PREFCORE_THRESHOLD,
or less than AMD_PSTATE_MAX_CPPC_PERF, and only then enable prefcore.
Thanks,
Ray
> +
> + return 0;
> +}
> +
> early_param("amd_pstate", amd_pstate_param);
> +early_param("amd_prefcore", amd_prefcore_param);
>
> MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
> MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");
> --
> 2.34.1
>
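The detection suggested in this reply could look roughly like the following user-space sketch. It is one possible reading of the comment, not code from the series: hw_supports_prefcore() is a hypothetical helper, and it assumes that any highest_perf below 255 (which includes the 166 threshold) indicates hardware support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirror AMD_PSTATE_PREFCORE_THRESHOLD and AMD_PSTATE_MAX_CPPC_PERF. */
#define PREFCORE_THRESHOLD	166
#define PREFCORE_MAX_CPPC_PERF	255

/* The threshold must sit below the maximum for the check below to cover it. */
static_assert(PREFCORE_THRESHOLD < PREFCORE_MAX_CPPC_PERF,
	      "threshold must be below the CPPC maximum");

/*
 * Hypothetical helper: treat a highest_perf below 255 as evidence that the
 * processor ranks its cores, so preferred core can be enabled.
 */
static bool hw_supports_prefcore(uint64_t highest_perf)
{
	return highest_perf < PREFCORE_MAX_CPPC_PERF;
}
```

A single comparison is enough here because the threshold case is a subset of the "less than maximum" case.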
* RE: [PATCH V4 3/7] cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting.
2023-08-29 8:01 ` Huang Rui
@ 2023-08-31 1:42 ` Meng, Li (Jassmine)
0 siblings, 0 replies; 13+ messages in thread
From: Meng, Li (Jassmine) @ 2023-08-31 1:42 UTC (permalink / raw)
To: Huang, Ray
Cc: Rafael J . Wysocki, linux-pm@vger.kernel.org,
linux-kernel@vger.kernel.org, x86@kernel.org,
linux-acpi@vger.kernel.org, Shuah Khan,
linux-kselftest@vger.kernel.org, Fontenot, Nathan, Sharma, Deepak,
Deucher, Alexander, Limonciello, Mario, Huang, Shimmer,
Yuan, Perry, Du, Xiaojian, Viresh Kumar, Borislav Petkov
Hi Ray:
> On Tue, Aug 29, 2023 at 02:43:36PM +0800, Meng, Li (Jassmine) wrote:
> > AMD Pstate driver utilizes the functions and data structures provided
> > by the ITMT architecture to enable the scheduler to favor scheduling
> > on cores which can be get a higher frequency with lower voltage. We
> > call it AMD Pstate Preferrred Core.
> >
> > Here sched_set_itmt_core_prio() is called to set priorities and
> > sched_set_itmt_support() is called to enable ITMT feature.
> > AMD Pstate driver uses the highest performance value to indicate the
> > priority of CPU. The higher value has a higher priority.
> >
> > The initial core rankings are set up by AMD Pstate when the system
> > boots.
> >
> > Add device attribute for preferred core states.
> >
> > Add one new early parameter `enable` to allow user to enable the
> > preferred core if the processor and power firmware can support
> > preferred core feature.
> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
> > Signed-off-by: Meng Li <li.meng@amd.com>
> > Co-developed-by: Meng Li <li.meng@amd.com>
> > Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
> > ---
> > drivers/cpufreq/amd-pstate.c | 120 ++++++++++++++++++++++++++++++-----
> > 1 file changed, 104 insertions(+), 16 deletions(-)
> >
> > [...]
> >
> > @@ -1527,7 +1605,17 @@ static int __init amd_pstate_param(char *str)
> >
> > return amd_pstate_set_driver(mode_idx);
> > }
> > +
> > +static int __init amd_prefcore_param(char *str)
> > +{
> > + if (!strcmp(str, "disable"))
> > + prefcore = false;
>
> You know, the prefercore is a hardware capacity, so we should have a way to
> detect current processor whether it's supported. E.X. whether we can read
> highest_perf value is AMD_PSTATE_PREFCORE_THRESHOLD or less than
> AMD_PSTATE_MAX_CPPC_PERF, then set the prefcore enabled.
>
> Thanks,
> Ray
>
[Meng, Li (Jassmine)]
Yes, you are right.
Here we only provide an interface for users to disable preferred core.
By default, the platform enables preferred core if the hardware supports this feature.
When the hardware doesn't support this feature, the variable "prefcore" will be set to false.
Only when the hardware supports preferred core and the user sets "enable" will the variable "prefcore" be set to true.
> > +
> > + return 0;
> > +}
> > +
> > early_param("amd_pstate", amd_pstate_param);
> > +early_param("amd_prefcore", amd_prefcore_param);
> >
> > MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
> > MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");
> > --
> > 2.34.1
> >
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V4 3/7] cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting.
2023-08-29 6:43 ` [PATCH V4 3/7] cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting Meng Li
2023-08-29 8:01 ` Huang Rui
@ 2023-08-29 14:52 ` kernel test robot
1 sibling, 0 replies; 13+ messages in thread
From: kernel test robot @ 2023-08-29 14:52 UTC (permalink / raw)
To: Meng Li, Rafael J . Wysocki, Huang Rui
Cc: llvm, oe-kbuild-all, linux-pm, linux-kernel, x86, linux-acpi,
Shuah Khan, linux-kselftest, Nathan Fontenot, Deepak Sharma,
Alex Deucher, Mario Limonciello, Shimmer Huang, Perry Yuan,
Xiaojian Du, Viresh Kumar, Borislav Petkov, Meng Li
Hi Meng,
kernel test robot noticed the following build warnings:
[auto build test WARNING on rafael-pm/linux-next]
[also build test WARNING on linus/master v6.5 next-20230829]
[cannot apply to tip/x86/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Meng-Li/x86-Drop-CPU_SUP_INTEL-from-SCHED_MC_PRIO-for-the-expansion/20230829-144723
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20230829064340.1136448-4-li.meng%40amd.com
patch subject: [PATCH V4 3/7] cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting.
config: x86_64-randconfig-r005-20230829 (https://download.01.org/0day-ci/archive/20230829/202308292233.XhcXfvSm-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230829/202308292233.XhcXfvSm-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308292233.XhcXfvSm-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/cpufreq/amd-pstate.c:692: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Get the highest performance register value.
vim +692 drivers/cpufreq/amd-pstate.c
690
691 /**
> 692 * Get the highest performance register value.
693 * @cpu: CPU from which to get highest performance.
694 * @highest_perf: Return address.
695 *
696 * Return: 0 for success, -EIO otherwise.
697 */
698 static int amd_pstate_get_highest_perf(int cpu, u64 *highest_perf)
699 {
700 int ret;
701
702 if (boot_cpu_has(X86_FEATURE_CPPC)) {
703 u64 cap1;
704
705 ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &cap1);
706 if (ret)
707 return ret;
708 WRITE_ONCE(*highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
709 } else {
710 ret = cppc_get_highest_perf(cpu, highest_perf);
711 }
712
713 return (ret);
714 }
715
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
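
The W=1 warning above fires because the comment opens with '/**' but its
first line does not name the function. A minimal sketch of the kernel-doc
form, assuming the usual "name() - summary" first line. The body is a
userspace stand-in: the MSR/CPPC reads are replaced by a hypothetical
fake_highest_perf table, so this is not the driver code.

```c
#include <errno.h>
#include <stdint.h>

#define NR_FAKE_CPUS 4

/* Hypothetical stand-in for the per-CPU highest performance register. */
static const uint64_t fake_highest_perf[NR_FAKE_CPUS] = { 166, 196, 201, 231 };

/**
 * amd_pstate_get_highest_perf() - Get the highest performance register value.
 * @cpu: CPU from which to get highest performance.
 * @highest_perf: Return address.
 *
 * Return: 0 for success, -EIO otherwise.
 */
static int amd_pstate_get_highest_perf(int cpu, uint64_t *highest_perf)
{
	if (cpu < 0 || cpu >= NR_FAKE_CPUS || !highest_perf)
		return -EIO;

	*highest_perf = fake_highest_perf[cpu];
	return 0;
}
```

Only the comment header matters for the warning; the alternative is to open
the comment with a plain '/*' if kernel-doc output is not wanted.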
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH V4 4/7] cpufreq: Add a notification message that the highest perf has changed
2023-08-29 6:43 [PATCH V4 0/7] AMD Pstate Preferred Core Meng Li
` (2 preceding siblings ...)
2023-08-29 6:43 ` [PATCH V4 3/7] cpufreq: amd-pstate: Enable AMD Pstate Preferred Core Supporting Meng Li
@ 2023-08-29 6:43 ` Meng Li
2023-08-29 16:27 ` kernel test robot
2023-08-29 6:43 ` [PATCH V4 5/7] cpufreq: amd-pstate: Update AMD Pstate Preferred Core ranking dynamically Meng Li
` (2 subsequent siblings)
6 siblings, 1 reply; 13+ messages in thread
From: Meng Li @ 2023-08-29 6:43 UTC (permalink / raw)
To: Rafael J . Wysocki, Huang Rui
Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
Viresh Kumar, Borislav Petkov, Meng Li
ACPI 6.5 section 8.4.6.1.1.1 specifies that Notify event 0x85 can be
emitted to cause the OSPM to re-evaluate the highest performance
register. Add support for this event.
Signed-off-by: Meng Li <li.meng@amd.com>
Link: https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html?highlight=cppc#cpc-continuous-performance-control
---
drivers/acpi/processor_driver.c | 6 ++++++
drivers/cpufreq/cpufreq.c | 13 +++++++++++++
include/linux/cpufreq.h | 4 ++++
3 files changed, 23 insertions(+)
diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index 4bd16b3f0781..29b2fb68a35d 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -27,6 +27,7 @@
#define ACPI_PROCESSOR_NOTIFY_PERFORMANCE 0x80
#define ACPI_PROCESSOR_NOTIFY_POWER 0x81
#define ACPI_PROCESSOR_NOTIFY_THROTTLING 0x82
+#define ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED 0x85
MODULE_AUTHOR("Paul Diefenbaugh");
MODULE_DESCRIPTION("ACPI Processor Driver");
@@ -83,6 +84,11 @@ static void acpi_processor_notify(acpi_handle handle, u32 event, void *data)
acpi_bus_generate_netlink_event(device->pnp.device_class,
dev_name(&device->dev), event, 0);
break;
+ case ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED:
+ cpufreq_update_highest_perf(pr->id);
+ acpi_bus_generate_netlink_event(device->pnp.device_class,
+ dev_name(&device->dev), event, 0);
+ break;
default:
acpi_handle_debug(handle, "Unsupported event [0x%x]\n", event);
break;
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 50bbc969ffe5..842357abfae6 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2675,6 +2675,19 @@ void cpufreq_update_limits(unsigned int cpu)
}
EXPORT_SYMBOL_GPL(cpufreq_update_limits);
+/**
+ * cpufreq_update_highest_perf - Update highest performance for a given CPU.
+ * @cpu: CPU to update the highest performance for.
+ *
+ * Invoke the driver's ->update_highest_perf callback if present
+ */
+void cpufreq_update_highest_perf(unsigned int cpu)
+{
+ if (cpufreq_driver->update_highest_perf)
+ cpufreq_driver->update_highest_perf(cpu);
+}
+EXPORT_SYMBOL_GPL(cpufreq_update_highest_perf);
+
/*********************************************************************
* BOOST *
*********************************************************************/
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 9bf94ae08158..58106b3d9183 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -232,6 +232,7 @@ int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu);
void refresh_frequency_limits(struct cpufreq_policy *policy);
void cpufreq_update_policy(unsigned int cpu);
void cpufreq_update_limits(unsigned int cpu);
+void cpufreq_update_highest_perf(unsigned int cpu);
bool have_governor_per_policy(void);
bool cpufreq_supports_freq_invariance(void);
struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
@@ -377,6 +378,9 @@ struct cpufreq_driver {
/* Called to update policy limits on firmware notifications. */
void (*update_limits)(unsigned int cpu);
+ /* Called to update highest performance on firmware notifications. */
+ void (*update_highest_perf)(unsigned int cpu);
+
/* optional */
int (*bios_limit)(int cpu, unsigned int *limit);
--
2.34.1
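
The path this patch adds (Notify 0x85 -> cpufreq core -> optional driver
hook) can be modelled in userspace roughly as below. All demo_* names are
hypothetical stand-ins, not kernel symbols. The NULL check on the driver
pointer is an addition of this sketch; the patch only checks the callback.

```c
#include <stddef.h>

#define NOTIFY_HIGHEST_PERF_CHANGED 0x85 /* value taken from the patch */

/* Reduced, hypothetical stand-in for struct cpufreq_driver. */
struct demo_driver {
	void (*update_highest_perf)(unsigned int cpu); /* optional hook */
};

static struct demo_driver *demo_active_driver;
static unsigned int demo_last_cpu;
static int demo_calls;

/* Example hook that only records which CPU was reported. */
static void demo_record_cpu(unsigned int cpu)
{
	demo_last_cpu = cpu;
	demo_calls++;
}

/* Mirrors cpufreq_update_highest_perf(): call the hook only if present. */
static void demo_update_highest_perf(unsigned int cpu)
{
	if (demo_active_driver && demo_active_driver->update_highest_perf)
		demo_active_driver->update_highest_perf(cpu);
}

/* Mirrors the new case in acpi_processor_notify(); returns 1 if handled. */
static int demo_notify(unsigned int event, unsigned int cpu)
{
	switch (event) {
	case NOTIFY_HIGHEST_PERF_CHANGED:
		demo_update_highest_perf(cpu);
		return 1;
	default:
		return 0;
	}
}
```

Because the hook is optional, drivers that do not implement it are
unaffected: the event is accepted and nothing else happens.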
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH V4 4/7] cpufreq: Add a notification message that the highest perf has changed
2023-08-29 6:43 ` [PATCH V4 4/7] cpufreq: Add a notification message that the highest perf has changed Meng Li
@ 2023-08-29 16:27 ` kernel test robot
0 siblings, 0 replies; 13+ messages in thread
From: kernel test robot @ 2023-08-29 16:27 UTC (permalink / raw)
To: Meng Li, Rafael J . Wysocki, Huang Rui
Cc: llvm, oe-kbuild-all, linux-pm, linux-kernel, x86, linux-acpi,
Shuah Khan, linux-kselftest, Nathan Fontenot, Deepak Sharma,
Alex Deucher, Mario Limonciello, Shimmer Huang, Perry Yuan,
Xiaojian Du, Viresh Kumar, Borislav Petkov, Meng Li
Hi Meng,
kernel test robot noticed the following build errors:
[auto build test ERROR on rafael-pm/linux-next]
[also build test ERROR on linus/master v6.5 next-20230829]
[cannot apply to tip/x86/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Meng-Li/x86-Drop-CPU_SUP_INTEL-from-SCHED_MC_PRIO-for-the-expansion/20230829-144723
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20230829064340.1136448-5-li.meng%40amd.com
patch subject: [PATCH V4 4/7] cpufreq: Add a notification message that the highest perf has changed
config: i386-randconfig-r031-20230829 (https://download.01.org/0day-ci/archive/20230830/202308300057.ASUJQpsV-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230830/202308300057.ASUJQpsV-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308300057.ASUJQpsV-lkp@intel.com/
All errors (new ones prefixed by >>):
>> drivers/acpi/processor_driver.c:88:3: error: call to undeclared function 'cpufreq_update_highest_perf'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
cpufreq_update_highest_perf(pr->id);
^
1 error generated.
vim +/cpufreq_update_highest_perf +88 drivers/acpi/processor_driver.c
53
54 static void acpi_processor_notify(acpi_handle handle, u32 event, void *data)
55 {
56 struct acpi_device *device = data;
57 struct acpi_processor *pr;
58 int saved;
59
60 if (device->handle != handle)
61 return;
62
63 pr = acpi_driver_data(device);
64 if (!pr)
65 return;
66
67 switch (event) {
68 case ACPI_PROCESSOR_NOTIFY_PERFORMANCE:
69 saved = pr->performance_platform_limit;
70 acpi_processor_ppc_has_changed(pr, 1);
71 if (saved == pr->performance_platform_limit)
72 break;
73 acpi_bus_generate_netlink_event(device->pnp.device_class,
74 dev_name(&device->dev), event,
75 pr->performance_platform_limit);
76 break;
77 case ACPI_PROCESSOR_NOTIFY_POWER:
78 acpi_processor_power_state_has_changed(pr);
79 acpi_bus_generate_netlink_event(device->pnp.device_class,
80 dev_name(&device->dev), event, 0);
81 break;
82 case ACPI_PROCESSOR_NOTIFY_THROTTLING:
83 acpi_processor_tstate_has_changed(pr);
84 acpi_bus_generate_netlink_event(device->pnp.device_class,
85 dev_name(&device->dev), event, 0);
86 break;
87 case ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED:
> 88 cpufreq_update_highest_perf(pr->id);
89 acpi_bus_generate_netlink_event(device->pnp.device_class,
90 dev_name(&device->dev), event, 0);
91 break;
92 default:
93 acpi_handle_debug(handle, "Unsupported event [0x%x]\n", event);
94 break;
95 }
96
97 return;
98 }
99
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
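
The error is consistent with a configuration in which the prototype from
include/linux/cpufreq.h is not visible to processor_driver.c, for example
if it sits inside an #ifdef CONFIG_CPU_FREQ block and this randconfig has
CPU_FREQ off. That is an assumption; the report does not say so. The usual
remedy in that case is an empty static inline stub for the disabled
configuration, sketched here with a hypothetical DEMO_CPU_FREQ switch:

```c
/* Hypothetical switch standing in for CONFIG_CPU_FREQ; left undefined here. */
/* #define DEMO_CPU_FREQ */

static int demo_updates;

#ifdef DEMO_CPU_FREQ
/* Real implementation, built only when the subsystem is enabled. */
static inline void demo_update_highest_perf(unsigned int cpu)
{
	(void)cpu;
	demo_updates++;
}
#else
/* Empty stub so callers still compile when the subsystem is disabled. */
static inline void demo_update_highest_perf(unsigned int cpu)
{
	(void)cpu;
}
#endif

/* Caller modelled on the new case in acpi_processor_notify(). */
static int demo_handle_event(unsigned int event, unsigned int cpu)
{
	if (event != 0x85)
		return 0;
	demo_update_highest_perf(cpu);
	return 1;
}
```

With the stub in place the caller needs no #ifdef of its own.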
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH V4 5/7] cpufreq: amd-pstate: Update AMD Pstate Preferred Core ranking dynamically
2023-08-29 6:43 [PATCH V4 0/7] AMD Pstate Preferred Core Meng Li
` (3 preceding siblings ...)
2023-08-29 6:43 ` [PATCH V4 4/7] cpufreq: Add a notification message that the highest perf has changed Meng Li
@ 2023-08-29 6:43 ` Meng Li
2023-08-29 6:43 ` [PATCH V4 6/7] Documentation: amd-pstate: introduce AMD Pstate Preferred Core Meng Li
2023-08-29 6:43 ` [PATCH V4 7/7] Documentation: introduce AMD Pstate Preferrd Core mode kernel command line options Meng Li
6 siblings, 0 replies; 13+ messages in thread
From: Meng Li @ 2023-08-29 6:43 UTC (permalink / raw)
To: Rafael J . Wysocki, Huang Rui
Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
Viresh Kumar, Borislav Petkov, Meng Li, Wyes Karny
Preferred core rankings can be changed dynamically by the
platform based on the workload and platform conditions and
accounting for thermals and aging.
When this occurs, the CPU priority needs to be updated.
Signed-off-by: Meng Li <li.meng@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
---
drivers/cpufreq/amd-pstate.c | 32 ++++++++++++++++++++++++++++++++
include/linux/amd-pstate.h | 1 +
2 files changed, 33 insertions(+)
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index d02305675f66..8a8e4ecb1b5c 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -315,6 +315,7 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));
+ WRITE_ONCE(cpudata->prefcore_highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
return 0;
}
@@ -336,6 +337,7 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
WRITE_ONCE(cpudata->lowest_nonlinear_perf,
cppc_perf.lowest_nonlinear_perf);
WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
+ WRITE_ONCE(cpudata->prefcore_highest_perf, cppc_perf.highest_perf);
if (cppc_state == AMD_PSTATE_ACTIVE)
return 0;
@@ -744,6 +746,34 @@ static void amd_pstate_init_prefcore(void)
schedule_work(&sched_prefcore_work);
}
+static void amd_pstate_update_highest_perf(unsigned int cpu)
+{
+ struct cpufreq_policy *policy;
+ struct amd_cpudata *cpudata;
+ u32 prev_high = 0, cur_high = 0;
+ u64 highest_perf;
+ int ret;
+
+ if (!prefcore)
+ return;
+
+ ret = amd_pstate_get_highest_perf(cpu, &highest_perf);
+ if (ret)
+ return;
+
+ policy = cpufreq_cpu_get(cpu);
+ cpudata = policy->driver_data;
+ cur_high = highest_perf;
+ prev_high = READ_ONCE(cpudata->prefcore_highest_perf);
+
+ if (prev_high != cur_high) {
+ WRITE_ONCE(cpudata->prefcore_highest_perf, cur_high);
+ sched_set_itmt_core_prio(cur_high, cpu);
+ }
+
+ cpufreq_cpu_put(policy);
+}
+
static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
{
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -1468,6 +1498,7 @@ static struct cpufreq_driver amd_pstate_driver = {
.suspend = amd_pstate_cpu_suspend,
.resume = amd_pstate_cpu_resume,
.set_boost = amd_pstate_set_boost,
+ .update_highest_perf = amd_pstate_update_highest_perf,
.name = "amd-pstate",
.attr = amd_pstate_attr,
};
@@ -1482,6 +1513,7 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
.online = amd_pstate_epp_cpu_online,
.suspend = amd_pstate_epp_suspend,
.resume = amd_pstate_epp_resume,
+ .update_highest_perf = amd_pstate_update_highest_perf,
.name = "amd-pstate-epp",
.attr = amd_pstate_epp_attr,
};
diff --git a/include/linux/amd-pstate.h b/include/linux/amd-pstate.h
index 446394f84606..fa86bc953d3e 100644
--- a/include/linux/amd-pstate.h
+++ b/include/linux/amd-pstate.h
@@ -70,6 +70,7 @@ struct amd_cpudata {
u32 nominal_perf;
u32 lowest_nonlinear_perf;
u32 lowest_perf;
+ u32 prefcore_highest_perf;
u32 max_freq;
u32 min_freq;
--
2.34.1
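
One detail of amd_pstate_update_highest_perf() above: cpufreq_cpu_get() can
return NULL, and the patch dereferences the result without checking. A
userspace sketch of the same update-only-on-change logic with that check
added, as one possible hardening. The demo_* types are hypothetical reduced
stand-ins, and the counter replaces the call to sched_set_itmt_core_prio().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for amd_cpudata and cpufreq_policy. */
struct demo_cpudata { uint32_t prefcore_highest_perf; };
struct demo_policy { struct demo_cpudata *driver_data; };

static int demo_prio_updates; /* counts simulated scheduler priority updates */

/*
 * Returns 1 if the cached ranking changed, 0 if it was already current,
 * -1 if no policy is available for the CPU.
 */
static int demo_update_highest_perf(struct demo_policy *policy, uint32_t cur_high)
{
	struct demo_cpudata *cpudata;

	if (!policy || !policy->driver_data)
		return -1;

	cpudata = policy->driver_data;
	if (cpudata->prefcore_highest_perf == cur_high)
		return 0;

	cpudata->prefcore_highest_perf = cur_high;
	demo_prio_updates++; /* the patch calls sched_set_itmt_core_prio() here */
	return 1;
}
```

Skipping the update when the value is unchanged keeps repeated notifications
from touching the scheduler priorities needlessly.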
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH V4 6/7] Documentation: amd-pstate: introduce AMD Pstate Preferred Core
2023-08-29 6:43 [PATCH V4 0/7] AMD Pstate Preferred Core Meng Li
` (4 preceding siblings ...)
2023-08-29 6:43 ` [PATCH V4 5/7] cpufreq: amd-pstate: Update AMD Pstate Preferred Core ranking dynamically Meng Li
@ 2023-08-29 6:43 ` Meng Li
2023-08-29 8:07 ` Huang Rui
2023-08-29 6:43 ` [PATCH V4 7/7] Documentation: introduce AMD Pstate Preferrd Core mode kernel command line options Meng Li
6 siblings, 1 reply; 13+ messages in thread
From: Meng Li @ 2023-08-29 6:43 UTC (permalink / raw)
To: Rafael J . Wysocki, Huang Rui
Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
Viresh Kumar, Borislav Petkov, Meng Li
Introduce AMD Pstate Preferred Core.
check preferred core state:
$ cat /sys/devices/system/cpu/amd-pstate/prefcore
Signed-off-by: Meng Li <li.meng@amd.com>
---
Documentation/admin-guide/pm/amd-pstate.rst | 54 +++++++++++++++++++++
1 file changed, 54 insertions(+)
diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 1cf40f69278c..400264d52007 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -353,6 +353,48 @@ is activated. In this mode, driver requests minimum and maximum performance
level and the platform autonomously selects a performance level in this range
and appropriate to the current workload.
+AMD Pstate Preferred Core
+=================================
+
+The core frequency is subjected to the process variation in semiconductors.
+Not all cores are able to reach the maximum frequency respecting the
+infrastructure limits. Consequently, AMD has redefined the concept of
+maximum frequency of a part. This means that a fraction of cores can reach
+maximum frequency. To find the best process scheduling policy for a given
+scenario, OS needs to know the core ordering informed by the platform through
+highest performance capability register of the CPPC interface.
+
+``AMD Pstate Preferred Core`` enables the scheduler to prefer scheduling on
+cores that can achieve a higher frequency with lower voltage. The preferred
+core rankings can dynamically change based on the workload, platform conditions,
+thermals and ageing.
+
+The priority metric will be initialized by the AMD Pstate driver. The AMD Pstate
+driver will also determine whether or not ``AMD Pstate Preferred Core`` is
+supported by the platform.
+
+AMD Pstate driver will provide an initial core ordering when the system boots.
+The platform uses the CPPC interfaces to communicate the core ranking to the
+operating system and scheduler to make sure that OS is choosing the cores
+with highest performance firstly for scheduling the process. When AMD Pstate
+driver receives a message with the highest performance change, it will
+update the core ranking and set the cpu's priority.
+
+AMD Preferred Core Switch
+=================================
+Kernel Parameters
+-----------------
+
+``AMD Pstate Preferred Core`` has two states: enable and disable.
+Enable/disable states can be chosen by different kernel parameters.
+Default enable ``AMD Pstate Preferred Core``.
+
+``amd_prefcore=disable``
+
+for systems that support ``AMD Pstate Preferred Core``, the core rankings will
+always be advertised by the platform. But OS can choose to ignore that via the
+kernel parameter ``amd_prefcore=disable``.
+
User Space Interface in ``sysfs`` - General
===========================================
@@ -385,6 +427,18 @@ control its functionality at the system level. They are located in the
to the operation mode represented by that string - or to be
unregistered in the "disable" case.
+``prefcore``
+ Preferred Core state of the driver: "enabled" or "disabled".
+
+ "enabled"
+ Enable the AMD Preferred Core.
+
+ "disabled"
+ Disable the AMD Preferred Core
+
+
+ This attribute is read-only to check the state of Preferred Core.
+
``cpupower`` tool support for ``amd-pstate``
===============================================
--
2.34.1
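
The documented "prefcore" attribute can also be read from a program rather
than with cat. A minimal sketch, assuming the file holds exactly "enabled"
or "disabled" followed by a newline, as the text above describes.
read_prefcore_state() is a hypothetical helper, not an existing API.

```c
#include <stdio.h>
#include <string.h>

/* Path taken from the commit message above. */
#define PREFCORE_PATH "/sys/devices/system/cpu/amd-pstate/prefcore"

/*
 * Returns 1 for "enabled", 0 for "disabled", -1 if the file is missing
 * or holds anything else. The path is a parameter so it can be tested.
 */
static int read_prefcore_state(const char *path)
{
	char buf[32] = "";
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0'; /* strip the trailing newline */
	if (!strcmp(buf, "enabled"))
		return 1;
	if (!strcmp(buf, "disabled"))
		return 0;
	return -1;
}
```

Calling read_prefcore_state(PREFCORE_PATH) on a system without the
amd-pstate driver loaded returns -1, since the file does not exist there.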
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH V4 6/7] Documentation: amd-pstate: introduce AMD Pstate Preferred Core
2023-08-29 6:43 ` [PATCH V4 6/7] Documentation: amd-pstate: introduce AMD Pstate Preferred Core Meng Li
@ 2023-08-29 8:07 ` Huang Rui
0 siblings, 0 replies; 13+ messages in thread
From: Huang Rui @ 2023-08-29 8:07 UTC (permalink / raw)
To: Meng, Li (Jassmine)
Cc: Rafael J . Wysocki, linux-pm@vger.kernel.org,
linux-kernel@vger.kernel.org, x86@kernel.org,
linux-acpi@vger.kernel.org, Shuah Khan,
linux-kselftest@vger.kernel.org, Fontenot, Nathan, Sharma, Deepak,
Deucher, Alexander, Limonciello, Mario, Huang, Shimmer,
Yuan, Perry, Du, Xiaojian, Viresh Kumar, Borislav Petkov
On Tue, Aug 29, 2023 at 02:43:39PM +0800, Meng, Li (Jassmine) wrote:
> Introduce AMD Pstate Preferred Core.
>
> check preferred core state:
> $ cat /sys/devices/system/cpu/amd-pstate/prefcore
>
> Signed-off-by: Meng Li <li.meng@amd.com>
> ---
> Documentation/admin-guide/pm/amd-pstate.rst | 54 +++++++++++++++++++++
> 1 file changed, 54 insertions(+)
>
> diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
> index 1cf40f69278c..400264d52007 100644
> --- a/Documentation/admin-guide/pm/amd-pstate.rst
> +++ b/Documentation/admin-guide/pm/amd-pstate.rst
> @@ -353,6 +353,48 @@ is activated. In this mode, driver requests minimum and maximum performance
> level and the platform autonomously selects a performance level in this range
> and appropriate to the current workload.
>
> +AMD Pstate Preferred Core
> +=================================
> +
> +The core frequency is subjected to the process variation in semiconductors.
> +Not all cores are able to reach the maximum frequency respecting the
> +infrastructure limits. Consequently, AMD has redefined the concept of
> +maximum frequency of a part. This means that a fraction of cores can reach
> +maximum frequency. To find the best process scheduling policy for a given
> +scenario, OS needs to know the core ordering informed by the platform through
> +highest performance capability register of the CPPC interface.
> +
> +``AMD Pstate Preferred Core`` enables the scheduler to prefer scheduling on
> +cores that can achieve a higher frequency with lower voltage. The preferred
> +core rankings can dynamically change based on the workload, platform conditions,
> +thermals and ageing.
> +
> +The priority metric will be initialized by the AMD Pstate driver. The AMD Pstate
Please align with ``amd-pstate`` in the whole documentation.
> +driver will also determine whether or not ``AMD Pstate Preferred Core`` is
> +supported by the platform.
> +
> +AMD Pstate driver will provide an initial core ordering when the system boots.
The same here.
> +The platform uses the CPPC interfaces to communicate the core ranking to the
> +operating system and scheduler to make sure that OS is choosing the cores
> +with highest performance firstly for scheduling the process. When AMD Pstate
> +driver receives a message with the highest performance change, it will
> +update the core ranking and set the cpu's priority.
> +
> +AMD Preferred Core Switch
> +=================================
> +Kernel Parameters
> +-----------------
> +
> +``AMD Pstate Preferred Core`` has two states: enable and disable.
> +Enable/disable states can be chosen by different kernel parameters.
> +Default enable ``AMD Pstate Preferred Core``.
> +
> +``amd_prefcore=disable``
> +
> +for systems that support ``AMD Pstate Preferred Core``, the core rankings will
> +always be advertised by the platform. But OS can choose to ignore that via the
> +kernel parameter ``amd_prefcore=disable``.
As commented on the previous patch, we had better let developers know how
to detect this feature in the hardware.
Thanks,
Ray
> +
> User Space Interface in ``sysfs`` - General
> ===========================================
>
> @@ -385,6 +427,18 @@ control its functionality at the system level. They are located in the
> to the operation mode represented by that string - or to be
> unregistered in the "disable" case.
>
> +``prefcore``
> + Preferred Core state of the driver: "enabled" or "disabled".
> +
> + "enabled"
> + Enable the AMD Preferred Core.
> +
> + "disabled"
> + Disable the AMD Preferred Core
> +
> +
> + This attribute is read-only to check the state of Preferred Core.
> +
> ``cpupower`` tool support for ``amd-pstate``
> ===============================================
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH V4 7/7] Documentation: introduce AMD Pstate Preferrd Core mode kernel command line options
2023-08-29 6:43 [PATCH V4 0/7] AMD Pstate Preferred Core Meng Li
` (5 preceding siblings ...)
2023-08-29 6:43 ` [PATCH V4 6/7] Documentation: amd-pstate: introduce AMD Pstate Preferred Core Meng Li
@ 2023-08-29 6:43 ` Meng Li
6 siblings, 0 replies; 13+ messages in thread
From: Meng Li @ 2023-08-29 6:43 UTC (permalink / raw)
To: Rafael J . Wysocki, Huang Rui
Cc: linux-pm, linux-kernel, x86, linux-acpi, Shuah Khan,
linux-kselftest, Nathan Fontenot, Deepak Sharma, Alex Deucher,
Mario Limonciello, Shimmer Huang, Perry Yuan, Xiaojian Du,
Viresh Kumar, Borislav Petkov, Meng Li, Wyes Karny
The AMD Pstate driver supports enabling and disabling Preferred Core.
It is enabled by default on platforms that support AMD Preferred Core.
To disable AMD Pstate Preferred Core, add
"amd_prefcore=disable" to the kernel command line.
Signed-off-by: Meng Li <li.meng@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wyes Karny <wyes.karny@amd.com>
---
Documentation/admin-guide/kernel-parameters.txt | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 23ebe34ff901..4f78067bb8af 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -363,6 +363,11 @@
selects a performance level in this range and appropriate
to the current workload.
+ amd_prefcore=
+ [X86]
+ disable
+ Disable AMD Pstate Preferred Core.
+
amijoy.map= [HW,JOY] Amiga joystick support
Map of devices attached to JOY0DAT and JOY1DAT
Format: <a>,<b>
--
2.34.1
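
The handling of this option follows amd_prefcore_param() from patch 3/7:
only the exact string "disable" changes anything. A userspace sketch with
hypothetical demo_* names. The NULL check is an addition of this sketch and
is not in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool demo_prefcore = true; /* default: enabled */

/* Modelled on amd_prefcore_param() from patch 3/7; returns 0 like the original. */
static int demo_prefcore_param(const char *str)
{
	if (str && !strcmp(str, "disable"))
		demo_prefcore = false;
	return 0;
}
```

Any other value, including "enable", leaves the default untouched, which
matches the single documented value above.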
^ permalink raw reply related [flat|nested] 13+ messages in thread