public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] cpufreq/amd-pstate: Set initial min_freq to  lowest_nonlinear_freq
@ 2024-10-03  8:39 Dhananjay Ugwekar
  2024-10-03  8:39 ` [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers Dhananjay Ugwekar
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Dhananjay Ugwekar @ 2024-10-03  8:39 UTC (permalink / raw)
  To: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang, rafael,
	viresh.kumar
  Cc: linux-pm, linux-kernel, Dhananjay Ugwekar

According to the AMD architectural programmer's manual volume 2 [1], 
in section "17.6.4.1 CPPC_CAPABILITY_1" lowest_nonlinear_perf is described 
as "Reports the most energy efficient performance level (in terms of 
performance per watt). Above this threshold, lower performance levels 
generally result in increased energy efficiency. Reducing performance 
below this threshold does not result in total energy savings for a given 
computation, although it reduces instantaneous power consumption". So 
lowest_nonlinear_perf is the most power efficient performance level, and 
going below that would lead to a worse performance/watt.

Also setting the minimum frequency to lowest_nonlinear_freq (instead of
lowest_freq) allows the CPU to idle at a higher frequency which leads
to more time being spent in a deeper idle state (as trivial idle tasks
are completed sooner). This has shown a power benefit in some systems.
In other systems, power consumption has increased but so has the
throughput/watt.

Our objective here is to update the initial lower frequency limit to 
lowest_nonlinear_freq, while allowing the user to later update the lower 
limit to anywhere between lowest_freq to highest_freq for the platform.

Currently, amd-pstate driver sets the cpudata->req[0] qos_request (lets 
call it amd_pstate_req) to the lowest_freq value at init time, and 
cpufreq.c sets the min_freq_req to 0 (which also gets resolved to the 
lowest_freq eventually). Writing to scaling_min_freq, only updates 
min_freq_req qos_request, while the amd_pstate_req always stays same as the
 initial value. This leads to the amd_pstate_req becoming the hard lower 
limit (due to the nature of priority lists used to manage the min_freq 
requests). Hence, if we update the amd_pstate_req to lowest_nonlinear_freq 
from amd-pstate driver code, user will never be able to set 
scaling_min_freq to a value lower than that.

This problem is occurring due to the existence of two different sources
of lower frequency limits, i.e. cpufreq.c and amd-pstate.c. Removing the 
cpudata->req[0], and updating the min_freq_req itself from amd-pstate
driver at init time fixes this issue and gives flexibility to the driver 
code as well as allows the user to independently update the lower limit
later on.

So, add a callback in cpufreq_driver to update the min_freq_req from 
cpufreq drivers and use it to set the initial min_freq to 
lowest_nonlinear_freq for amd-pstate driver (active, passive and guided 
modes) and cleanup the old min_freq qos request code.

Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf [1]

Dhananjay Ugwekar (3):
  cpufreq: Add a callback to update the min_freq_req from drivers
  cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq
  cpufreq/amd-pstate: Cleanup the old min_freq qos request remnants

 drivers/cpufreq/amd-pstate.c | 35 +++++++++++++++++------------------
 drivers/cpufreq/amd-pstate.h |  4 ++--
 drivers/cpufreq/cpufreq.c    |  6 +++++-
 include/linux/cpufreq.h      |  6 ++++++
 4 files changed, 30 insertions(+), 21 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-03  8:39 [PATCH 0/3] cpufreq/amd-pstate: Set initial min_freq to lowest_nonlinear_freq Dhananjay Ugwekar
@ 2024-10-03  8:39 ` Dhananjay Ugwekar
  2024-10-04  6:41   ` Gautham R. Shenoy
                     ` (2 more replies)
  2024-10-03  8:39 ` [PATCH 2/3] cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq Dhananjay Ugwekar
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 16+ messages in thread
From: Dhananjay Ugwekar @ 2024-10-03  8:39 UTC (permalink / raw)
  To: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang, rafael,
	viresh.kumar
  Cc: linux-pm, linux-kernel, Dhananjay Ugwekar

Currently, there is no proper way to update the initial lower frequency
limit from cpufreq drivers. Only way is to add a new min_freq qos
request from the driver side, but it leads to the issue explained below.

The QoS infrastructure collates the constraints from multiple
subsystems and saves them in a plist. The "current value" is defined to
be the highest value in the plist for min_freq constraint.

The cpufreq core adds a qos_request for min_freq to be 0 and the amd-pstate
driver today adds qos request for min_freq to be lowest_freq, where
lowest_freq corresponds to CPPC.lowest_perf.

Eg: Suppose WLOG considering amd-pstate driver, lowest_freq is 400000 KHz,
lowest_non_linear_freq is 1200000 KHz.

At this point of time, the min_freq QoS plist looks like:

head--> 400000 KHz (registered by amd-pstate) --> 0 KHz (registered by
cpufreq core)

When a user updates /sys/devices/system/cpu/cpuX/cpufreq/scaling_min_freq,
it only results in updating the cpufreq-core's node in the plist, where
say 0 becomes the newly echoed value.

Now, if the user echoes a value 1000000 KHz, to scaling_min_freq, then the
new list would be

head--> 1000000 KHz (registered by cpufreq core) --> 400000 KHz (registered
by amd-pstate)

and the new "current value" of the min_freq QoS constraint will be 1000000
KHz, this is the scenario where it works as expected.

Suppose we change the amd-pstate driver code's min_freq qos constraint
to lowest_non_linear_freq instead of lowest_freq, then the user will
never be able to request a value below that, due to the following:

At boot time, the min_freq QoS plist would be

head--> 1200000 KHz (registered by amd-pstate) --> 0 KHz (registered by 
cpufreq core)

When the user echoes a value of 1000000 KHz, to
/sys/devices/..../scaling_min_freq, then the new list would be

head--> 1200000 KHz (registered by amd-pstate) --> 1000000 KHz (registered 
by cpufreq core)

with the new "current value" of the min_freq QoS remaining 1200000 KHz.
Since the current value has not changed, there won't be any notifications
sent to the subsystems which have added their QoS constraints. In
particular, the amd-pstate driver will not get the notification, and thus,
the user's request to lower the scaling_min_freq will be ineffective.

Hence, it is advisable to have a single source of truth for the min and
max freq QoS constraints between the cpufreq and the cpufreq drivers.

So add a new callback get_init_min_freq() add in struct cpufreq_driver,
which allows amd-pstate (or any other cpufreq driver) to override the
default min_freq value being set in the policy->min_freq_req. Now
scaling_min_freq can be modified by the user to any value (lower or
higher than the init value) later on if desired.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
---
 drivers/cpufreq/cpufreq.c | 6 +++++-
 include/linux/cpufreq.h   | 6 ++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index f98c9438760c..2923068cf5f4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1361,6 +1361,7 @@ static int cpufreq_online(unsigned int cpu)
 	bool new_policy;
 	unsigned long flags;
 	unsigned int j;
+	u32 init_min_freq = FREQ_QOS_MIN_DEFAULT_VALUE;
 	int ret;
 
 	pr_debug("%s: bringing CPU%u online\n", __func__, cpu);
@@ -1445,9 +1446,12 @@ static int cpufreq_online(unsigned int cpu)
 			goto out_destroy_policy;
 		}
 
+		if (cpufreq_driver->get_init_min_freq)
+			init_min_freq = cpufreq_driver->get_init_min_freq(policy);
+
 		ret = freq_qos_add_request(&policy->constraints,
 					   policy->min_freq_req, FREQ_QOS_MIN,
-					   FREQ_QOS_MIN_DEFAULT_VALUE);
+					   init_min_freq);
 		if (ret < 0) {
 			/*
 			 * So we don't call freq_qos_remove_request() for an
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index e0e19d9c1323..b20488b55f6c 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -414,6 +414,12 @@ struct cpufreq_driver {
 	 * policy is properly initialized, but before the governor is started.
 	 */
 	void		(*register_em)(struct cpufreq_policy *policy);
+
+	/*
+	 * Set by drivers that want to initialize the policy->min_freq_req with
+	 * a value different from the default value (0) in cpufreq core.
+	 */
+	int		(*get_init_min_freq)(struct cpufreq_policy *policy);
 };
 
 /* flags */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq
  2024-10-03  8:39 [PATCH 0/3] cpufreq/amd-pstate: Set initial min_freq to lowest_nonlinear_freq Dhananjay Ugwekar
  2024-10-03  8:39 ` [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers Dhananjay Ugwekar
@ 2024-10-03  8:39 ` Dhananjay Ugwekar
  2024-10-04  8:47   ` Gautham R. Shenoy
  2024-10-03  8:39 ` [PATCH 3/3] cpufreq/amd-pstate: Cleanup the old min_freq qos request remnants Dhananjay Ugwekar
  2024-10-03 19:23 ` [PATCH 0/3] cpufreq/amd-pstate: Set initial min_freq to lowest_nonlinear_freq Mario Limonciello
  3 siblings, 1 reply; 16+ messages in thread
From: Dhananjay Ugwekar @ 2024-10-03  8:39 UTC (permalink / raw)
  To: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang, rafael,
	viresh.kumar
  Cc: linux-pm, linux-kernel, Dhananjay Ugwekar

According to the AMD architectural programmer's manual volume 2 [1], in
section "17.6.4.1 CPPC_CAPABILITY_1" lowest_nonlinear_perf is described
as "Reports the most energy efficient performance level (in terms of
performance per watt). Above this threshold, lower performance levels
generally result in increased energy efficiency. Reducing performance
below this threshold does not result in total energy savings for a given
computation, although it reduces instantaneous power consumption". So
lowest_nonlinear_perf is the most power efficient performance level, and
going below that would lead to a worse performance/watt.

Also, setting the minimum frequency to lowest_nonlinear_freq (instead of
lowest_freq) allows the CPU to idle at a higher frequency which leads
to more time being spent in a deeper idle state (as trivial idle tasks
are completed sooner). This has shown a power benefit in some systems,
in other systems, power consumption has increased but so has the
throughput/watt.

Use the get_init_min_freq() callback to set the initial lower limit for
amd-pstate driver to lowest_nonlinear_freq instead of lowest_freq.

Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf [1]

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index b7a17a3ef122..f8abae9a0156 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -995,13 +995,6 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 	if (cpu_feature_enabled(X86_FEATURE_CPPC))
 		policy->fast_switch_possible = true;
 
-	ret = freq_qos_add_request(&policy->constraints, &cpudata->req[0],
-				   FREQ_QOS_MIN, policy->cpuinfo.min_freq);
-	if (ret < 0) {
-		dev_err(dev, "Failed to add min-freq constraint (%d)\n", ret);
-		goto free_cpudata1;
-	}
-
 	ret = freq_qos_add_request(&policy->constraints, &cpudata->req[1],
 				   FREQ_QOS_MAX, policy->cpuinfo.max_freq);
 	if (ret < 0) {
@@ -1706,6 +1699,13 @@ static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
 	return 0;
 }
 
+static int amd_pstate_get_init_min_freq(struct cpufreq_policy *policy)
+{
+	struct amd_cpudata *cpudata = policy->driver_data;
+
+	return READ_ONCE(cpudata->lowest_nonlinear_freq);
+}
+
 static struct cpufreq_driver amd_pstate_driver = {
 	.flags		= CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
 	.verify		= amd_pstate_verify,
@@ -1719,6 +1719,7 @@ static struct cpufreq_driver amd_pstate_driver = {
 	.update_limits	= amd_pstate_update_limits,
 	.name		= "amd-pstate",
 	.attr		= amd_pstate_attr,
+	.get_init_min_freq = amd_pstate_get_init_min_freq,
 };
 
 static struct cpufreq_driver amd_pstate_epp_driver = {
@@ -1735,6 +1736,7 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
 	.set_boost	= amd_pstate_set_boost,
 	.name		= "amd-pstate-epp",
 	.attr		= amd_pstate_epp_attr,
+	.get_init_min_freq = amd_pstate_get_init_min_freq,
 };
 
 static int __init amd_pstate_set_driver(int mode_idx)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] cpufreq/amd-pstate: Cleanup the old min_freq qos request remnants
  2024-10-03  8:39 [PATCH 0/3] cpufreq/amd-pstate: Set initial min_freq to lowest_nonlinear_freq Dhananjay Ugwekar
  2024-10-03  8:39 ` [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers Dhananjay Ugwekar
  2024-10-03  8:39 ` [PATCH 2/3] cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq Dhananjay Ugwekar
@ 2024-10-03  8:39 ` Dhananjay Ugwekar
  2024-10-03 19:23 ` [PATCH 0/3] cpufreq/amd-pstate: Set initial min_freq to lowest_nonlinear_freq Mario Limonciello
  3 siblings, 0 replies; 16+ messages in thread
From: Dhananjay Ugwekar @ 2024-10-03  8:39 UTC (permalink / raw)
  To: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang, rafael,
	viresh.kumar
  Cc: linux-pm, linux-kernel, Dhananjay Ugwekar

Convert the freq_qos_request array in struct amd_cpudata to a single
variable (only for max_freq request). Remove the references to cpudata->req
array. Remove and rename the jump labels accordingly.

Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 19 ++++++++-----------
 drivers/cpufreq/amd-pstate.h |  4 ++--
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index f8abae9a0156..cdc08d2ddd52 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -696,7 +696,7 @@ static int amd_pstate_cpu_boost_update(struct cpufreq_policy *policy, bool on)
 	policy->max = policy->cpuinfo.max_freq;
 
 	if (cppc_state == AMD_PSTATE_PASSIVE) {
-		ret = freq_qos_update_request(&cpudata->req[1], policy->cpuinfo.max_freq);
+		ret = freq_qos_update_request(&cpudata->max_freq_req, policy->cpuinfo.max_freq);
 		if (ret < 0)
 			pr_debug("Failed to update freq constraint: CPU%d\n", cpudata->cpu);
 	}
@@ -963,17 +963,17 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 
 	ret = amd_pstate_init_perf(cpudata);
 	if (ret)
-		goto free_cpudata1;
+		goto free_cpudata;
 
 	amd_pstate_init_prefcore(cpudata);
 
 	ret = amd_pstate_init_freq(cpudata);
 	if (ret)
-		goto free_cpudata1;
+		goto free_cpudata;
 
 	ret = amd_pstate_init_boost_support(cpudata);
 	if (ret)
-		goto free_cpudata1;
+		goto free_cpudata;
 
 	min_freq = READ_ONCE(cpudata->min_freq);
 	max_freq = READ_ONCE(cpudata->max_freq);
@@ -995,11 +995,11 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 	if (cpu_feature_enabled(X86_FEATURE_CPPC))
 		policy->fast_switch_possible = true;
 
-	ret = freq_qos_add_request(&policy->constraints, &cpudata->req[1],
+	ret = freq_qos_add_request(&policy->constraints, &cpudata->max_freq_req,
 				   FREQ_QOS_MAX, policy->cpuinfo.max_freq);
 	if (ret < 0) {
 		dev_err(dev, "Failed to add max-freq constraint (%d)\n", ret);
-		goto free_cpudata2;
+		goto free_cpudata;
 	}
 
 	cpudata->max_limit_freq = max_freq;
@@ -1012,9 +1012,7 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 
 	return 0;
 
-free_cpudata2:
-	freq_qos_remove_request(&cpudata->req[0]);
-free_cpudata1:
+free_cpudata:
 	kfree(cpudata);
 	return ret;
 }
@@ -1023,8 +1021,7 @@ static void amd_pstate_cpu_exit(struct cpufreq_policy *policy)
 {
 	struct amd_cpudata *cpudata = policy->driver_data;
 
-	freq_qos_remove_request(&cpudata->req[1]);
-	freq_qos_remove_request(&cpudata->req[0]);
+	freq_qos_remove_request(&cpudata->max_freq_req);
 	policy->fast_switch_possible = false;
 	kfree(cpudata);
 }
diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
index cd573bc6b6db..643e2a71827d 100644
--- a/drivers/cpufreq/amd-pstate.h
+++ b/drivers/cpufreq/amd-pstate.h
@@ -28,7 +28,7 @@ struct amd_aperf_mperf {
 /**
  * struct amd_cpudata - private CPU data for AMD P-State
  * @cpu: CPU number
- * @req: constraint request to apply
+ * @max_freq_req: maximum frequency constraint request to apply
  * @cppc_req_cached: cached performance request hints
  * @highest_perf: the maximum performance an individual processor may reach,
  *		  assuming ideal conditions
@@ -68,7 +68,7 @@ struct amd_aperf_mperf {
 struct amd_cpudata {
 	int	cpu;
 
-	struct	freq_qos_request req[2];
+	struct	freq_qos_request max_freq_req;
 	u64	cppc_req_cached;
 
 	u32	highest_perf;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/3] cpufreq/amd-pstate: Set initial min_freq to lowest_nonlinear_freq
  2024-10-03  8:39 [PATCH 0/3] cpufreq/amd-pstate: Set initial min_freq to lowest_nonlinear_freq Dhananjay Ugwekar
                   ` (2 preceding siblings ...)
  2024-10-03  8:39 ` [PATCH 3/3] cpufreq/amd-pstate: Cleanup the old min_freq qos request remnants Dhananjay Ugwekar
@ 2024-10-03 19:23 ` Mario Limonciello
  3 siblings, 0 replies; 16+ messages in thread
From: Mario Limonciello @ 2024-10-03 19:23 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar
  Cc: linux-pm, linux-kernel, Dhananjay Ugwekar, gautham.shenoy,
	perry.yuan, ray.huang

On 10/3/2024 03:39, Dhananjay Ugwekar wrote:
> According to the AMD architectural programmer's manual volume 2 [1],
> in section "17.6.4.1 CPPC_CAPABILITY_1" lowest_nonlinear_perf is described
> as "Reports the most energy efficient performance level (in terms of
> performance per watt). Above this threshold, lower performance levels
> generally result in increased energy efficiency. Reducing performance
> below this threshold does not result in total energy savings for a given
> computation, although it reduces instantaneous power consumption". So
> lowest_nonlinear_perf is the most power efficient performance level, and
> going below that would lead to a worse performance/watt.
> 
> Also setting the minimum frequency to lowest_nonlinear_freq (instead of
> lowest_freq) allows the CPU to idle at a higher frequency which leads
> to more time being spent in a deeper idle state (as trivial idle tasks
> are completed sooner). This has shown a power benefit in some systems.
> In other systems, power consumption has increased but so has the
> throughput/watt.
> 
> Our objective here is to update the initial lower frequency limit to
> lowest_nonlinear_freq, while allowing the user to later update the lower
> limit to anywhere between lowest_freq to highest_freq for the platform.
> 
> Currently, amd-pstate driver sets the cpudata->req[0] qos_request (lets
> call it amd_pstate_req) to the lowest_freq value at init time, and
> cpufreq.c sets the min_freq_req to 0 (which also gets resolved to the
> lowest_freq eventually). Writing to scaling_min_freq, only updates
> min_freq_req qos_request, while the amd_pstate_req always stays same as the
>   initial value. This leads to the amd_pstate_req becoming the hard lower
> limit (due to the nature of priority lists used to manage the min_freq
> requests). Hence, if we update the amd_pstate_req to lowest_nonlinear_freq
> from amd-pstate driver code, user will never be able to set
> scaling_min_freq to a value lower than that.
> 
> This problem is occurring due to the existence of two different sources
> of lower frequency limits, i.e. cpufreq.c and amd-pstate.c. Removing the
> cpudata->req[0], and updating the min_freq_req itself from amd-pstate
> driver at init time fixes this issue and gives flexibility to the driver
> code as well as allows the user to independently update the lower limit
> later on.
> 
> So, add a callback in cpufreq_driver to update the min_freq_req from
> cpufreq drivers and use it to set the initial min_freq to
> lowest_nonlinear_freq for amd-pstate driver (active, passive and guided
> modes) and cleanup the old min_freq qos request code.
> 
> Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf [1]
> 
> Dhananjay Ugwekar (3):
>    cpufreq: Add a callback to update the min_freq_req from drivers
>    cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq
>    cpufreq/amd-pstate: Cleanup the old min_freq qos request remnants
> 
>   drivers/cpufreq/amd-pstate.c | 35 +++++++++++++++++------------------
>   drivers/cpufreq/amd-pstate.h |  4 ++--
>   drivers/cpufreq/cpufreq.c    |  6 +++++-
>   include/linux/cpufreq.h      |  6 ++++++
>   4 files changed, 30 insertions(+), 21 deletions(-)
> 

Thanks for the series, it looks good to me and from my testing works as 
intended.

For the series:
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>

Rafael, Viresh,

Can I get your A-b on patch 1?  I'll take it with other new amd-pstate 
content coming this cycle then.

Thansk,

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-03  8:39 ` [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers Dhananjay Ugwekar
@ 2024-10-04  6:41   ` Gautham R. Shenoy
  2024-10-04 18:17   ` Rafael J. Wysocki
  2024-10-10  7:35   ` Viresh Kumar
  2 siblings, 0 replies; 16+ messages in thread
From: Gautham R. Shenoy @ 2024-10-04  6:41 UTC (permalink / raw)
  To: Dhananjay Ugwekar
  Cc: mario.limonciello, perry.yuan, ray.huang, rafael, viresh.kumar,
	linux-pm, linux-kernel

On Thu, Oct 03, 2024 at 08:39:52AM +0000, Dhananjay Ugwekar wrote:
> Currently, there is no proper way to update the initial lower frequency
> limit from cpufreq drivers. Only way is to add a new min_freq qos
> request from the driver side, but it leads to the issue explained below.
> 
> The QoS infrastructure collates the constraints from multiple
> subsystems and saves them in a plist. The "current value" is defined to
> be the highest value in the plist for min_freq constraint.
> 
> The cpufreq core adds a qos_request for min_freq to be 0 and the amd-pstate
> driver today adds qos request for min_freq to be lowest_freq, where
> lowest_freq corresponds to CPPC.lowest_perf.
> 
> Eg: Suppose WLOG considering amd-pstate driver, lowest_freq is 400000 KHz,
> lowest_non_linear_freq is 1200000 KHz.
> 
> At this point of time, the min_freq QoS plist looks like:
> 
> head--> 400000 KHz (registered by amd-pstate) --> 0 KHz (registered by
> cpufreq core)
> 
> When a user updates /sys/devices/system/cpu/cpuX/cpufreq/scaling_min_freq,
> it only results in updating the cpufreq-core's node in the plist, where
> say 0 becomes the newly echoed value.
> 
> Now, if the user echoes a value 1000000 KHz, to scaling_min_freq, then the
> new list would be
> 
> head--> 1000000 KHz (registered by cpufreq core) --> 400000 KHz (registered
> by amd-pstate)
> 
> and the new "current value" of the min_freq QoS constraint will be 1000000
> KHz, this is the scenario where it works as expected.
> 
> Suppose we change the amd-pstate driver code's min_freq qos constraint
> to lowest_non_linear_freq instead of lowest_freq, then the user will
> never be able to request a value below that, due to the following:
> 
> At boot time, the min_freq QoS plist would be
> 
> head--> 1200000 KHz (registered by amd-pstate) --> 0 KHz (registered by 
> cpufreq core)
> 
> When the user echoes a value of 1000000 KHz, to
> /sys/devices/..../scaling_min_freq, then the new list would be
> 
> head--> 1200000 KHz (registered by amd-pstate) --> 1000000 KHz (registered 
> by cpufreq core)
> 
> with the new "current value" of the min_freq QoS remaining 1200000 KHz.
> Since the current value has not changed, there won't be any notifications
> sent to the subsystems which have added their QoS constraints. In
> particular, the amd-pstate driver will not get the notification, and thus,
> the user's request to lower the scaling_min_freq will be ineffective.
> 
> Hence, it is advisable to have a single source of truth for the min and
> max freq QoS constraints between the cpufreq and the cpufreq drivers.
> 
> So add a new callback get_init_min_freq() add in struct cpufreq_driver,
> which allows amd-pstate (or any other cpufreq driver) to override the
> default min_freq value being set in the policy->min_freq_req. Now
> scaling_min_freq can be modified by the user to any value (lower or
> higher than the init value) later on if desired.

Looks good to me.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>

Rafael and Viresh, could you please weigh in on this new callback.

--
Thanks and Regards
gautham.

> 
> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
> ---
>  drivers/cpufreq/cpufreq.c | 6 +++++-
>  include/linux/cpufreq.h   | 6 ++++++
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index f98c9438760c..2923068cf5f4 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1361,6 +1361,7 @@ static int cpufreq_online(unsigned int cpu)
>  	bool new_policy;
>  	unsigned long flags;
>  	unsigned int j;
> +	u32 init_min_freq = FREQ_QOS_MIN_DEFAULT_VALUE;
>  	int ret;
>  
>  	pr_debug("%s: bringing CPU%u online\n", __func__, cpu);
> @@ -1445,9 +1446,12 @@ static int cpufreq_online(unsigned int cpu)
>  			goto out_destroy_policy;
>  		}
>  
> +		if (cpufreq_driver->get_init_min_freq)
> +			init_min_freq = cpufreq_driver->get_init_min_freq(policy);
> +
>  		ret = freq_qos_add_request(&policy->constraints,
>  					   policy->min_freq_req, FREQ_QOS_MIN,
> -					   FREQ_QOS_MIN_DEFAULT_VALUE);
> +					   init_min_freq);
>  		if (ret < 0) {
>  			/*
>  			 * So we don't call freq_qos_remove_request() for an
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index e0e19d9c1323..b20488b55f6c 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -414,6 +414,12 @@ struct cpufreq_driver {
>  	 * policy is properly initialized, but before the governor is started.
>  	 */
>  	void		(*register_em)(struct cpufreq_policy *policy);
> +
> +	/*
> +	 * Set by drivers that want to initialize the policy->min_freq_req with
> +	 * a value different from the default value (0) in cpufreq core.
> +	 */
> +	int		(*get_init_min_freq)(struct cpufreq_policy *policy);
>  };
>  
>  /* flags */
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq
  2024-10-03  8:39 ` [PATCH 2/3] cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq Dhananjay Ugwekar
@ 2024-10-04  8:47   ` Gautham R. Shenoy
  0 siblings, 0 replies; 16+ messages in thread
From: Gautham R. Shenoy @ 2024-10-04  8:47 UTC (permalink / raw)
  To: Dhananjay Ugwekar
  Cc: mario.limonciello, perry.yuan, ray.huang, rafael, viresh.kumar,
	linux-pm, linux-kernel

On Thu, Oct 03, 2024 at 08:39:54AM +0000, Dhananjay Ugwekar wrote:
> According to the AMD architectural programmer's manual volume 2 [1], in
> section "17.6.4.1 CPPC_CAPABILITY_1" lowest_nonlinear_perf is described
> as "Reports the most energy efficient performance level (in terms of
> performance per watt). Above this threshold, lower performance levels
> generally result in increased energy efficiency. Reducing performance
> below this threshold does not result in total energy savings for a given
> computation, although it reduces instantaneous power consumption". So
> lowest_nonlinear_perf is the most power efficient performance level, and
> going below that would lead to a worse performance/watt.
> 
> Also, setting the minimum frequency to lowest_nonlinear_freq (instead of
> lowest_freq) allows the CPU to idle at a higher frequency which leads
> to more time being spent in a deeper idle state (as trivial idle tasks
> are completed sooner). This has shown a power benefit in some systems,
> in other systems, power consumption has increased but so has the
> throughput/watt.
> 
> Use the get_init_min_freq() callback to set the initial lower limit for
> amd-pstate driver to lowest_nonlinear_freq instead of lowest_freq.
> 
> Link: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf [1]

It is good enough to quote "AMD64 Architecture Programmer's Manual,
Volume 2, Revision 3.42" instead of providing a link.

Other than that,

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>



> 
> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>


--
Thanks and Regards
gautham.

> ---
>  drivers/cpufreq/amd-pstate.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index b7a17a3ef122..f8abae9a0156 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -995,13 +995,6 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>  	if (cpu_feature_enabled(X86_FEATURE_CPPC))
>  		policy->fast_switch_possible = true;
>  
> -	ret = freq_qos_add_request(&policy->constraints, &cpudata->req[0],
> -				   FREQ_QOS_MIN, policy->cpuinfo.min_freq);
> -	if (ret < 0) {
> -		dev_err(dev, "Failed to add min-freq constraint (%d)\n", ret);
> -		goto free_cpudata1;
> -	}
> -
>  	ret = freq_qos_add_request(&policy->constraints, &cpudata->req[1],
>  				   FREQ_QOS_MAX, policy->cpuinfo.max_freq);
>  	if (ret < 0) {
> @@ -1706,6 +1699,13 @@ static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
>  	return 0;
>  }
>  
> +static int amd_pstate_get_init_min_freq(struct cpufreq_policy *policy)
> +{
> +	struct amd_cpudata *cpudata = policy->driver_data;
> +
> +	return READ_ONCE(cpudata->lowest_nonlinear_freq);
> +}
> +
>  static struct cpufreq_driver amd_pstate_driver = {
>  	.flags		= CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
>  	.verify		= amd_pstate_verify,
> @@ -1719,6 +1719,7 @@ static struct cpufreq_driver amd_pstate_driver = {
>  	.update_limits	= amd_pstate_update_limits,
>  	.name		= "amd-pstate",
>  	.attr		= amd_pstate_attr,
> +	.get_init_min_freq = amd_pstate_get_init_min_freq,
>  };
>  
>  static struct cpufreq_driver amd_pstate_epp_driver = {
> @@ -1735,6 +1736,7 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
>  	.set_boost	= amd_pstate_set_boost,
>  	.name		= "amd-pstate-epp",
>  	.attr		= amd_pstate_epp_attr,
> +	.get_init_min_freq = amd_pstate_get_init_min_freq,
>  };
>  
>  static int __init amd_pstate_set_driver(int mode_idx)
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-03  8:39 ` [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers Dhananjay Ugwekar
  2024-10-04  6:41   ` Gautham R. Shenoy
@ 2024-10-04 18:17   ` Rafael J. Wysocki
  2024-10-07  4:40     ` Dhananjay Ugwekar
  2024-10-10  7:35   ` Viresh Kumar
  2 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2024-10-04 18:17 UTC (permalink / raw)
  To: Dhananjay Ugwekar
  Cc: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang, rafael,
	viresh.kumar, linux-pm, linux-kernel

On Thu, Oct 3, 2024 at 10:44 AM Dhananjay Ugwekar
<Dhananjay.Ugwekar@amd.com> wrote:
>
> Currently, there is no proper way to update the initial lower frequency
> limit from cpufreq drivers.

Why do you want to do it?

> Only way is to add a new min_freq qos
> request from the driver side, but it leads to the issue explained below.
>
> The QoS infrastructure collates the constraints from multiple
> subsystems and saves them in a plist. The "current value" is defined to
> be the highest value in the plist for min_freq constraint.
>
> The cpufreq core adds a qos_request for min_freq to be 0 and the amd-pstate
> driver today adds qos request for min_freq to be lowest_freq, where
> lowest_freq corresponds to CPPC.lowest_perf.
>
> Eg: Suppose WLOG considering amd-pstate driver, lowest_freq is 400000 KHz,
> lowest_non_linear_freq is 1200000 KHz.
>
> At this point of time, the min_freq QoS plist looks like:
>
> head--> 400000 KHz (registered by amd-pstate) --> 0 KHz (registered by
> cpufreq core)
>
> When a user updates /sys/devices/system/cpu/cpuX/cpufreq/scaling_min_freq,
> it only results in updating the cpufreq-core's node in the plist, where
> say 0 becomes the newly echoed value.
>
> Now, if the user echoes a value 1000000 KHz, to scaling_min_freq, then the
> new list would be
>
> head--> 1000000 KHz (registered by cpufreq core) --> 400000 KHz (registered
> by amd-pstate)
>
> and the new "current value" of the min_freq QoS constraint will be 1000000
> KHz, this is the scenario where it works as expected.
>
> Suppose we change the amd-pstate driver code's min_freq qos constraint
> to lowest_non_linear_freq instead of lowest_freq, then the user will
> never be able to request a value below that, due to the following:
>
> At boot time, the min_freq QoS plist would be
>
> head--> 1200000 KHz (registered by amd-pstate) --> 0 KHz (registered by
> cpufreq core)
>
> When the user echoes a value of 1000000 KHz, to
> /sys/devices/..../scaling_min_freq, then the new list would be
>
> head--> 1200000 KHz (registered by amd-pstate) --> 1000000 KHz (registered
> by cpufreq core)
>
> with the new "current value" of the min_freq QoS remaining 1200000 KHz.

Yes, that's how frequency QoS works.

> Since the current value has not changed, there won't be any notifications
> sent to the subsystems which have added their QoS constraints. In
> particular, the amd-pstate driver will not get the notification, and thus,
> the user's request to lower the scaling_min_freq will be ineffective.

The value written by user space to scaling_min_freq is a vote, not a
request.  It may not be physically possible to reduce the frequency
below a certain minimum level that need not be known to the user.

> Hence, it is advisable to have a single source of truth for the min and
> max freq QoS constraints between the cpufreq and the cpufreq drivers.
>
> So add a new callback get_init_min_freq() add in struct cpufreq_driver,
> which allows amd-pstate (or any other cpufreq driver) to override the
> default min_freq value being set in the policy->min_freq_req. Now
> scaling_min_freq can be modified by the user to any value (lower or
> higher than the init value) later on if desired.
>
> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
> ---
>  drivers/cpufreq/cpufreq.c | 6 +++++-
>  include/linux/cpufreq.h   | 6 ++++++
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index f98c9438760c..2923068cf5f4 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1361,6 +1361,7 @@ static int cpufreq_online(unsigned int cpu)
>         bool new_policy;
>         unsigned long flags;
>         unsigned int j;
> +       u32 init_min_freq = FREQ_QOS_MIN_DEFAULT_VALUE;
>         int ret;
>
>         pr_debug("%s: bringing CPU%u online\n", __func__, cpu);
> @@ -1445,9 +1446,12 @@ static int cpufreq_online(unsigned int cpu)
>                         goto out_destroy_policy;
>                 }
>
> +               if (cpufreq_driver->get_init_min_freq)
> +                       init_min_freq = cpufreq_driver->get_init_min_freq(policy);
> +
>                 ret = freq_qos_add_request(&policy->constraints,
>                                            policy->min_freq_req, FREQ_QOS_MIN,
> -                                          FREQ_QOS_MIN_DEFAULT_VALUE);
> +                                          init_min_freq);
>                 if (ret < 0) {
>                         /*
>                          * So we don't call freq_qos_remove_request() for an
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index e0e19d9c1323..b20488b55f6c 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -414,6 +414,12 @@ struct cpufreq_driver {
>          * policy is properly initialized, but before the governor is started.
>          */
>         void            (*register_em)(struct cpufreq_policy *policy);
> +
> +       /*
> +        * Set by drivers that want to initialize the policy->min_freq_req with
> +        * a value different from the default value (0) in cpufreq core.
> +        */
> +       int             (*get_init_min_freq)(struct cpufreq_policy *policy);
>  };
>
>  /* flags */
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-04 18:17   ` Rafael J. Wysocki
@ 2024-10-07  4:40     ` Dhananjay Ugwekar
  2024-10-07 15:46       ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Dhananjay Ugwekar @ 2024-10-07  4:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang,
	viresh.kumar, linux-pm, linux-kernel

Hello Rafael,

On 10/4/2024 11:47 PM, Rafael J. Wysocki wrote:
> On Thu, Oct 3, 2024 at 10:44 AM Dhananjay Ugwekar
> <Dhananjay.Ugwekar@amd.com> wrote:
>>
>> Currently, there is no proper way to update the initial lower frequency
>> limit from cpufreq drivers.
> 
> Why do you want to do it?

We want to set the initial lower frequency limit at a more efficient level 
(lowest_nonlinear_freq) than the lowest frequency, which helps save power in 
some idle scenarios, and also improves benchmark results in some scenarios. 
At the same time, we want to allow the user to set the lower limit back to 
the inefficient lowest frequency.

Thanks,
Dhananjay

> 
>> Only way is to add a new min_freq qos
>> request from the driver side, but it leads to the issue explained below.
>>
>> The QoS infrastructure collates the constraints from multiple
>> subsystems and saves them in a plist. The "current value" is defined to
>> be the highest value in the plist for min_freq constraint.
>>
>> The cpufreq core adds a qos_request for min_freq to be 0 and the amd-pstate
>> driver today adds qos request for min_freq to be lowest_freq, where
>> lowest_freq corresponds to CPPC.lowest_perf.
>>
>> Eg: Suppose WLOG considering amd-pstate driver, lowest_freq is 400000 KHz,
>> lowest_non_linear_freq is 1200000 KHz.
>>
>> At this point of time, the min_freq QoS plist looks like:
>>
>> head--> 400000 KHz (registered by amd-pstate) --> 0 KHz (registered by
>> cpufreq core)
>>
>> When a user updates /sys/devices/system/cpu/cpuX/cpufreq/scaling_min_freq,
>> it only results in updating the cpufreq-core's node in the plist, where
>> say 0 becomes the newly echoed value.
>>
>> Now, if the user echoes a value 1000000 KHz, to scaling_min_freq, then the
>> new list would be
>>
>> head--> 1000000 KHz (registered by cpufreq core) --> 400000 KHz (registered
>> by amd-pstate)
>>
>> and the new "current value" of the min_freq QoS constraint will be 1000000
>> KHz, this is the scenario where it works as expected.
>>
>> Suppose we change the amd-pstate driver code's min_freq qos constraint
>> to lowest_non_linear_freq instead of lowest_freq, then the user will
>> never be able to request a value below that, due to the following:
>>
>> At boot time, the min_freq QoS plist would be
>>
>> head--> 1200000 KHz (registered by amd-pstate) --> 0 KHz (registered by
>> cpufreq core)
>>
>> When the user echoes a value of 1000000 KHz, to
>> /sys/devices/..../scaling_min_freq, then the new list would be
>>
>> head--> 1200000 KHz (registered by amd-pstate) --> 1000000 KHz (registered
>> by cpufreq core)
>>
>> with the new "current value" of the min_freq QoS remaining 1200000 KHz.
> 
> Yes, that's how frequency QoS works.
> 
>> Since the current value has not changed, there won't be any notifications
>> sent to the subsystems which have added their QoS constraints. In
>> particular, the amd-pstate driver will not get the notification, and thus,
>> the user's request to lower the scaling_min_freq will be ineffective.
> 
> The value written by user space to scaling_min_freq is a vote, not a
> request.  It may not be physically possible to reduce the frequency
> below a certain minimum level that need not be known to the user.
> 
>> Hence, it is advisable to have a single source of truth for the min and
>> max freq QoS constraints between the cpufreq and the cpufreq drivers.
>>
>> So add a new callback get_init_min_freq() add in struct cpufreq_driver,
>> which allows amd-pstate (or any other cpufreq driver) to override the
>> default min_freq value being set in the policy->min_freq_req. Now
>> scaling_min_freq can be modified by the user to any value (lower or
>> higher than the init value) later on if desired.
>>
>> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
>> ---
>>  drivers/cpufreq/cpufreq.c | 6 +++++-
>>  include/linux/cpufreq.h   | 6 ++++++
>>  2 files changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index f98c9438760c..2923068cf5f4 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1361,6 +1361,7 @@ static int cpufreq_online(unsigned int cpu)
>>         bool new_policy;
>>         unsigned long flags;
>>         unsigned int j;
>> +       u32 init_min_freq = FREQ_QOS_MIN_DEFAULT_VALUE;
>>         int ret;
>>
>>         pr_debug("%s: bringing CPU%u online\n", __func__, cpu);
>> @@ -1445,9 +1446,12 @@ static int cpufreq_online(unsigned int cpu)
>>                         goto out_destroy_policy;
>>                 }
>>
>> +               if (cpufreq_driver->get_init_min_freq)
>> +                       init_min_freq = cpufreq_driver->get_init_min_freq(policy);
>> +
>>                 ret = freq_qos_add_request(&policy->constraints,
>>                                            policy->min_freq_req, FREQ_QOS_MIN,
>> -                                          FREQ_QOS_MIN_DEFAULT_VALUE);
>> +                                          init_min_freq);
>>                 if (ret < 0) {
>>                         /*
>>                          * So we don't call freq_qos_remove_request() for an
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index e0e19d9c1323..b20488b55f6c 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -414,6 +414,12 @@ struct cpufreq_driver {
>>          * policy is properly initialized, but before the governor is started.
>>          */
>>         void            (*register_em)(struct cpufreq_policy *policy);
>> +
>> +       /*
>> +        * Set by drivers that want to initialize the policy->min_freq_req with
>> +        * a value different from the default value (0) in cpufreq core.
>> +        */
>> +       int             (*get_init_min_freq)(struct cpufreq_policy *policy);
>>  };
>>
>>  /* flags */
>> --
>> 2.34.1
>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-07  4:40     ` Dhananjay Ugwekar
@ 2024-10-07 15:46       ` Rafael J. Wysocki
  2024-10-07 15:48         ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2024-10-07 15:46 UTC (permalink / raw)
  To: Dhananjay Ugwekar
  Cc: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang,
	viresh.kumar, linux-pm, linux-kernel

Hi,

On Mon, Oct 7, 2024 at 6:40 AM Dhananjay Ugwekar
<Dhananjay.Ugwekar@amd.com> wrote:
>
> Hello Rafael,
>
> On 10/4/2024 11:47 PM, Rafael J. Wysocki wrote:
> > On Thu, Oct 3, 2024 at 10:44 AM Dhananjay Ugwekar
> > <Dhananjay.Ugwekar@amd.com> wrote:
> >>
> >> Currently, there is no proper way to update the initial lower frequency
> >> limit from cpufreq drivers.
> >
> > Why do you want to do it?
>
> We want to set the initial lower frequency limit at a more efficient level
> (lowest_nonlinear_freq) than the lowest frequency, which helps save power in
> some idle scenarios, and also improves benchmark results in some scenarios.
> At the same time, we want to allow the user to set the lower limit back to
> the inefficient lowest frequency.

So you want the default value of scaling_min_freq to be greater than
the total floor.

I have to say that I'm not particularly fond of this approach because
it is adding a new meaning to scaling_min_freq: Setting it below the
default would not cause the driver to use inefficient frequencies
which user space may not be aware of.  Moreover, it would tell the
driver how far it could go with that.

IMV it would be bettwr to have a separate interface for this kind of tuning.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-07 15:46       ` Rafael J. Wysocki
@ 2024-10-07 15:48         ` Rafael J. Wysocki
  2024-10-08  6:32           ` Dhananjay Ugwekar
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2024-10-07 15:48 UTC (permalink / raw)
  To: Dhananjay Ugwekar
  Cc: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang,
	viresh.kumar, linux-pm, linux-kernel

On Mon, Oct 7, 2024 at 5:46 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> Hi,
>
> On Mon, Oct 7, 2024 at 6:40 AM Dhananjay Ugwekar
> <Dhananjay.Ugwekar@amd.com> wrote:
> >
> > Hello Rafael,
> >
> > On 10/4/2024 11:47 PM, Rafael J. Wysocki wrote:
> > > On Thu, Oct 3, 2024 at 10:44 AM Dhananjay Ugwekar
> > > <Dhananjay.Ugwekar@amd.com> wrote:
> > >>
> > >> Currently, there is no proper way to update the initial lower frequency
> > >> limit from cpufreq drivers.
> > >
> > > Why do you want to do it?
> >
> > We want to set the initial lower frequency limit at a more efficient level
> > (lowest_nonlinear_freq) than the lowest frequency, which helps save power in
> > some idle scenarios, and also improves benchmark results in some scenarios.
> > At the same time, we want to allow the user to set the lower limit back to
> > the inefficient lowest frequency.
>
> So you want the default value of scaling_min_freq to be greater than
> the total floor.
>
> I have to say that I'm not particularly fond of this approach because
> it is adding a new meaning to scaling_min_freq: Setting it below the
> default would not cause the driver to use inefficient frequencies

s/not/now/ (sorry)

I should have double checked this before sending.

> which user space may not be aware of.  Moreover, it would tell the
> driver how far it could go with that.
>
> IMV it would be bettwr to have a separate interface for this kind of tuning.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-07 15:48         ` Rafael J. Wysocki
@ 2024-10-08  6:32           ` Dhananjay Ugwekar
  2024-10-10 17:57             ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Dhananjay Ugwekar @ 2024-10-08  6:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang,
	viresh.kumar, linux-pm, linux-kernel

Hello Rafael,

On 10/7/2024 9:18 PM, Rafael J. Wysocki wrote:
> On Mon, Oct 7, 2024 at 5:46 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>
>> Hi,
>>
>> On Mon, Oct 7, 2024 at 6:40 AM Dhananjay Ugwekar
>> <Dhananjay.Ugwekar@amd.com> wrote:
>>>
>>> Hello Rafael,
>>>
>>> On 10/4/2024 11:47 PM, Rafael J. Wysocki wrote:
>>>> On Thu, Oct 3, 2024 at 10:44 AM Dhananjay Ugwekar
>>>> <Dhananjay.Ugwekar@amd.com> wrote:
>>>>>
>>>>> Currently, there is no proper way to update the initial lower frequency
>>>>> limit from cpufreq drivers.
>>>>
>>>> Why do you want to do it?
>>>
>>> We want to set the initial lower frequency limit at a more efficient level
>>> (lowest_nonlinear_freq) than the lowest frequency, which helps save power in
>>> some idle scenarios, and also improves benchmark results in some scenarios.
>>> At the same time, we want to allow the user to set the lower limit back to
>>> the inefficient lowest frequency.
>>
>> So you want the default value of scaling_min_freq to be greater than
>> the total floor.

Yes, we want to set the default min value to what we think is best for the platform.

>>
>> I have to say that I'm not particularly fond of this approach because
>> it is adding a new meaning to scaling_min_freq: Setting it below the
>> default would not cause the driver to use inefficient frequencies
> 
> s/not/now/ (sorry)

I believe we are not changing the meaning of the scaling_min_freq just setting it 
to the best value at boot and then allowing the user to have access to the entire 
frequency range for the platform. Also, we have cpuinfo_min_freq/max_freq to 
indicate to the user as to what the entire frequency range is for the platform 
(depending on boost enabled/disabled).

> 
> I should have double checked this before sending.
> 
>> which user space may not be aware of.

I guess, this part we can fix by documenting correctly ?

>> Moreover, it would tell the
>> driver how far it could go with that.

Sorry, I didnt understand this part.

>>
>> IMV it would be bettwr to have a separate interface for this kind of tuning.

I feel like we can incorporate this change cleanly enough into scaling_min_freq, 
without adding a new interface which might further confuse the user. But please 
let me know your concerns and thoughts.

Thanks,
Dhananjay

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-03  8:39 ` [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers Dhananjay Ugwekar
  2024-10-04  6:41   ` Gautham R. Shenoy
  2024-10-04 18:17   ` Rafael J. Wysocki
@ 2024-10-10  7:35   ` Viresh Kumar
  2024-10-10  9:54     ` Dhananjay Ugwekar
  2 siblings, 1 reply; 16+ messages in thread
From: Viresh Kumar @ 2024-10-10  7:35 UTC (permalink / raw)
  To: Dhananjay Ugwekar
  Cc: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang, rafael,
	linux-pm, linux-kernel

On 03-10-24, 08:39, Dhananjay Ugwekar wrote:
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index e0e19d9c1323..b20488b55f6c 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -414,6 +414,12 @@ struct cpufreq_driver {
>  	 * policy is properly initialized, but before the governor is started.
>  	 */
>  	void		(*register_em)(struct cpufreq_policy *policy);
> +
> +	/*
> +	 * Set by drivers that want to initialize the policy->min_freq_req with
> +	 * a value different from the default value (0) in cpufreq core.
> +	 */
> +	int		(*get_init_min_freq)(struct cpufreq_policy *policy);
>  };

Apart from Rafael's concern, I don't see why you need to define a callback for
something this basic. If we are going to make this change, why can't we just add
another u64 field in policy's structure which gives you the freq directly ?

-- 
viresh

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-10  7:35   ` Viresh Kumar
@ 2024-10-10  9:54     ` Dhananjay Ugwekar
  2024-10-10 10:07       ` Viresh Kumar
  0 siblings, 1 reply; 16+ messages in thread
From: Dhananjay Ugwekar @ 2024-10-10  9:54 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang, rafael,
	linux-pm, linux-kernel

Hello Viresh,

On 10/10/2024 1:05 PM, Viresh Kumar wrote:
> On 03-10-24, 08:39, Dhananjay Ugwekar wrote:
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index e0e19d9c1323..b20488b55f6c 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -414,6 +414,12 @@ struct cpufreq_driver {
>>  	 * policy is properly initialized, but before the governor is started.
>>  	 */
>>  	void		(*register_em)(struct cpufreq_policy *policy);
>> +
>> +	/*
>> +	 * Set by drivers that want to initialize the policy->min_freq_req with
>> +	 * a value different from the default value (0) in cpufreq core.
>> +	 */
>> +	int		(*get_init_min_freq)(struct cpufreq_policy *policy);
>>  };
> 
> Apart from Rafael's concern, I don't see why you need to define a callback for
> something this basic. If we are going to make this change, why can't we just add
> another u64 field in policy's structure which gives you the freq directly ?

Sure, that also works for us, if it is okay with you and Rafael, I can add the 
u64 field in policy struct.

Thanks,
Dhananjay

> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-10  9:54     ` Dhananjay Ugwekar
@ 2024-10-10 10:07       ` Viresh Kumar
  0 siblings, 0 replies; 16+ messages in thread
From: Viresh Kumar @ 2024-10-10 10:07 UTC (permalink / raw)
  To: Dhananjay Ugwekar
  Cc: gautham.shenoy, mario.limonciello, perry.yuan, ray.huang, rafael,
	linux-pm, linux-kernel

On 10-10-24, 15:24, Dhananjay Ugwekar wrote:
> Sure, that also works for us, if it is okay with you and Rafael, I can add the 
> u64 field in policy struct.

Let Rafael continue the discussion on the whole idea first, if we are aligned,
then you can make this change.

-- 
viresh

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers
  2024-10-08  6:32           ` Dhananjay Ugwekar
@ 2024-10-10 17:57             ` Rafael J. Wysocki
  0 siblings, 0 replies; 16+ messages in thread
From: Rafael J. Wysocki @ 2024-10-10 17:57 UTC (permalink / raw)
  To: Dhananjay Ugwekar
  Cc: Rafael J. Wysocki, gautham.shenoy, mario.limonciello, perry.yuan,
	ray.huang, viresh.kumar, linux-pm, linux-kernel

Hi,

On Tue, Oct 8, 2024 at 8:32 AM Dhananjay Ugwekar
<Dhananjay.Ugwekar@amd.com> wrote:
>
> Hello Rafael,
>
> On 10/7/2024 9:18 PM, Rafael J. Wysocki wrote:
> > On Mon, Oct 7, 2024 at 5:46 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >>
> >> Hi,
> >>
> >> On Mon, Oct 7, 2024 at 6:40 AM Dhananjay Ugwekar
> >> <Dhananjay.Ugwekar@amd.com> wrote:
> >>>
> >>> Hello Rafael,
> >>>
> >>> On 10/4/2024 11:47 PM, Rafael J. Wysocki wrote:
> >>>> On Thu, Oct 3, 2024 at 10:44 AM Dhananjay Ugwekar
> >>>> <Dhananjay.Ugwekar@amd.com> wrote:
> >>>>>
> >>>>> Currently, there is no proper way to update the initial lower frequency
> >>>>> limit from cpufreq drivers.
> >>>>
> >>>> Why do you want to do it?
> >>>
> >>> We want to set the initial lower frequency limit at a more efficient level
> >>> (lowest_nonlinear_freq) than the lowest frequency, which helps save power in
> >>> some idle scenarios, and also improves benchmark results in some scenarios.
> >>> At the same time, we want to allow the user to set the lower limit back to
> >>> the inefficient lowest frequency.
> >>
> >> So you want the default value of scaling_min_freq to be greater than
> >> the total floor.
>
> Yes, we want to set the default min value to what we think is best for the platform.
>
> >>
> >> I have to say that I'm not particularly fond of this approach because
> >> it is adding a new meaning to scaling_min_freq: Setting it below the
> >> default would not cause the driver to use inefficient frequencies
> >
> > s/not/now/ (sorry)
>
> I believe we are not changing the meaning of the scaling_min_freq just setting it
> to the best value at boot and then allowing the user to have access to the entire
> frequency range for the platform. Also, we have cpuinfo_min_freq/max_freq to
> indicate to the user as to what the entire frequency range is for the platform
> (depending on boost enabled/disabled).

But the default you want to set is the lowest efficient frequency
which user space needs to be aware of.  Otherwise it may make
suboptimal decisions.

> >
> > I should have double checked this before sending.
> >
> >> which user space may not be aware of.
>
> I guess, this part we can fix by documenting correctly ?
>
> >> Moreover, it would tell the
> >> driver how far it could go with that.
>
> Sorry, I didnt understand this part.

Sorry, never mind.  I was just repeating myself.

> >>
> >> IMV it would be bettwr to have a separate interface for this kind of tuning.
>
> I feel like we can incorporate this change cleanly enough into scaling_min_freq,
> without adding a new interface which might further confuse the user. But please
> let me know your concerns and thoughts.

I'm not sure if you realize that the .show() operation for
scaling_min_freq prints policy->max and you can easily make your
driver's .verify() callback change it to whatever value is desired.
You may as well set it to the lowest efficient frequency if the one
passed to you is equal to FREQ_QOS_MIN_DEFAULT_VALUE.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-10-10 17:57 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-03  8:39 [PATCH 0/3] cpufreq/amd-pstate: Set initial min_freq to lowest_nonlinear_freq Dhananjay Ugwekar
2024-10-03  8:39 ` [PATCH 1/3] cpufreq: Add a callback to update the min_freq_req from drivers Dhananjay Ugwekar
2024-10-04  6:41   ` Gautham R. Shenoy
2024-10-04 18:17   ` Rafael J. Wysocki
2024-10-07  4:40     ` Dhananjay Ugwekar
2024-10-07 15:46       ` Rafael J. Wysocki
2024-10-07 15:48         ` Rafael J. Wysocki
2024-10-08  6:32           ` Dhananjay Ugwekar
2024-10-10 17:57             ` Rafael J. Wysocki
2024-10-10  7:35   ` Viresh Kumar
2024-10-10  9:54     ` Dhananjay Ugwekar
2024-10-10 10:07       ` Viresh Kumar
2024-10-03  8:39 ` [PATCH 2/3] cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq Dhananjay Ugwekar
2024-10-04  8:47   ` Gautham R. Shenoy
2024-10-03  8:39 ` [PATCH 3/3] cpufreq/amd-pstate: Cleanup the old min_freq qos request remnants Dhananjay Ugwekar
2024-10-03 19:23 ` [PATCH 0/3] cpufreq/amd-pstate: Set initial min_freq to lowest_nonlinear_freq Mario Limonciello

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox