linux-pm.vger.kernel.org archive mirror
* [PATCH V5 0/4] Reduce the intel_pstate timer overhead
@ 2016-03-06  7:34 Philippe Longepe
  2016-03-06  7:34 ` [PATCH V5 1/4] Remove extra conversions in pid calculation Philippe Longepe
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Philippe Longepe @ 2016-03-06  7:34 UTC (permalink / raw)
  To: linux-pm; +Cc: srinivas.pandruvada, rafael

This series includes the following code optimizations:

Patch 1: Remove extra conversions in pid calculation
Patch 2: Optimize calculation for max/min_perf_adj
Patch 3: Move the intel_pstate_calc_busy into get_target_pstate_use_performance
Patch 4: Remove the freq calculation from the intel_pstate_calc_busy function

Philippe Longepe (4):
  Remove extra conversions in pid calculation
  Optimize calculation for max/min_perf_adj
  Move the intel_pstate_calc_busy into get_target_pstate_use_performance
  intel_pstate: Remove the freq calculation from the
    intel_pstate_calc_busy function

 drivers/cpufreq/intel_pstate.c | 33 ++++++++++++++++-----------------
 1 file changed, 16 insertions(+), 17 deletions(-)

-- 
1.9.1



* [PATCH V5 1/4] Remove extra conversions in pid calculation
  2016-03-06  7:34 [PATCH V5 0/4] Reduce the intel_pstate timer overhead Philippe Longepe
@ 2016-03-06  7:34 ` Philippe Longepe
  2016-03-07 21:30   ` Srinivas Pandruvada
  2016-03-06  7:34 ` [PATCH V5 2/4] Optimize calculation for max/min_perf_adj Philippe Longepe
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Philippe Longepe @ 2016-03-06  7:34 UTC (permalink / raw)
  To: linux-pm; +Cc: srinivas.pandruvada, rafael

pid->setpoint and pid->deadband can be initialize in float so we
can remove the int_tofp in pid_calc.

Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
---
 drivers/cpufreq/intel_pstate.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index eb0aef0..114e4e0 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -198,8 +198,8 @@ static struct perf_limits *limits = &powersave_limits;
 
 static inline void pid_reset(struct _pid *pid, int setpoint, int busy,
 			     int deadband, int integral) {
-	pid->setpoint = setpoint;
-	pid->deadband  = deadband;
+	pid->setpoint = int_tofp(setpoint);
+	pid->deadband  = int_tofp(deadband);
 	pid->integral  = int_tofp(integral);
 	pid->last_err  = int_tofp(setpoint) - int_tofp(busy);
 }
@@ -225,9 +225,9 @@ static signed int pid_calc(struct _pid *pid, int32_t busy)
 	int32_t pterm, dterm, fp_error;
 	int32_t integral_limit;
 
-	fp_error = int_tofp(pid->setpoint) - busy;
+	fp_error = pid->setpoint - busy;
 
-	if (abs(fp_error) <= int_tofp(pid->deadband))
+	if (abs(fp_error) <= pid->deadband)
 		return 0;
 
 	pterm = mul_fp(pid->p_gain, fp_error);
-- 
1.9.1
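
For readers following the arithmetic, here is a minimal user-space sketch of what this patch changes. It assumes FRAC_BITS is 8 and re-declares int_tofp() for illustration rather than copying it from the tree. The struct pid_model type and the functions deadband_before(), pid_model_reset(), deadband_after() and deadband_checks_agree() are hypothetical names that do not exist in the driver. The point is only that converting setpoint and deadband once at reset time gives the same comparison as converting them on every call.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 8 fractional bits in the fixed-point format. */
#define FRAC_BITS 8
#define int_tofp(X) ((int64_t)(X) << FRAC_BITS)

/* Hypothetical stand-in for the two fields of struct _pid touched here. */
struct pid_model {
	int32_t setpoint;
	int32_t deadband;
};

/* Before: plain integers are converted to fixed point on every call. */
static int deadband_before(int setpoint, int deadband, int32_t busy_fp)
{
	int32_t fp_error = (int32_t)int_tofp(setpoint) - busy_fp;

	return abs(fp_error) <= int_tofp(deadband);
}

/* After: the conversion happens once, when the model is reset. */
static void pid_model_reset(struct pid_model *pid, int setpoint, int deadband)
{
	pid->setpoint = (int32_t)int_tofp(setpoint);
	pid->deadband = (int32_t)int_tofp(deadband);
}

/* After: the per-sample path uses the stored fixed-point values as-is. */
static int deadband_after(const struct pid_model *pid, int32_t busy_fp)
{
	int32_t fp_error = pid->setpoint - busy_fp;

	return abs(fp_error) <= pid->deadband;
}

/* Returns 1 when both variants reach the same decision for one sample. */
static int deadband_checks_agree(int setpoint, int deadband, int busy)
{
	struct pid_model pid;
	int32_t busy_fp = (int32_t)int_tofp(busy);

	pid_model_reset(&pid, setpoint, deadband);
	return deadband_before(setpoint, deadband, busy_fp) ==
	       deadband_after(&pid, busy_fp);
}
```

Calling deadband_checks_agree(97, 2, 96) compares the two variants for a setpoint of 97, a deadband of 2 and a busy value of 96. The sketch only covers small non-negative inputs, which is the range these values take in practice.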



* [PATCH V5 2/4] Optimize calculation for max/min_perf_adj
  2016-03-06  7:34 [PATCH V5 0/4] Reduce the intel_pstate timer overhead Philippe Longepe
  2016-03-06  7:34 ` [PATCH V5 1/4] Remove extra conversions in pid calculation Philippe Longepe
@ 2016-03-06  7:34 ` Philippe Longepe
  2016-03-07 21:35   ` Srinivas Pandruvada
  2016-03-06  7:34 ` [PATCH V5 3/4] Move the intel_pstate_calc_busy into get_target_pstate_use_performance Philippe Longepe
  2016-03-06  7:34 ` [PATCH V5 4/4] intel_pstate: Remove the freq calculation from the intel_pstate_calc_busy function Philippe Longepe
  3 siblings, 1 reply; 9+ messages in thread
From: Philippe Longepe @ 2016-03-06  7:34 UTC (permalink / raw)
  To: linux-pm; +Cc: srinivas.pandruvada, rafael

mul_fp(int_tofp(A), B) expands to:
((A << FRAC_BITS) * B) >> FRAC_BITS, so the same result can be obtained
via simple multiplication A * B.  Apply this observation to
max_perf * limits->max_perf and max_perf * limits->min_perf in
intel_pstate_get_min_max().

Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
---
 drivers/cpufreq/intel_pstate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 114e4e0..c46d23a 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -828,11 +828,11 @@ static void intel_pstate_get_min_max(struct cpudata *cpu, int *min, int *max)
 	 * policy, or by cpu specific default values determined through
 	 * experimentation.
 	 */
-	max_perf_adj = fp_toint(mul_fp(int_tofp(max_perf), limits->max_perf));
+	max_perf_adj = fp_toint(max_perf * limits->max_perf);
 	*max = clamp_t(int, max_perf_adj,
 			cpu->pstate.min_pstate, cpu->pstate.turbo_pstate);
 
-	min_perf = fp_toint(mul_fp(int_tofp(max_perf), limits->min_perf));
+	min_perf = fp_toint(max_perf * limits->min_perf);
 	*min = clamp_t(int, min_perf, cpu->pstate.min_pstate, max_perf);
 }
 
-- 
1.9.1
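
The identity in the changelog above can be checked with a small user-space sketch. It assumes FRAC_BITS is 8 and re-declares int_tofp(), fp_toint() and mul_fp() for illustration rather than copying them from the tree. perf_adj_before() and perf_adj_after() are hypothetical names for the old and new expressions.

```c
#include <stdint.h>

/* Assumption: 8 fractional bits in the fixed-point format. */
#define FRAC_BITS 8
#define int_tofp(X) ((int64_t)(X) << FRAC_BITS)
#define fp_toint(X) ((X) >> FRAC_BITS)

/* Fixed-point multiply: the 64-bit product is scaled back by FRAC_BITS. */
static int32_t mul_fp(int32_t x, int32_t y)
{
	return (int32_t)(((int64_t)x * (int64_t)y) >> FRAC_BITS);
}

/* Hypothetical helper: the expression used before the patch. */
static int perf_adj_before(int max_perf, int32_t limit_fp)
{
	return fp_toint(mul_fp((int32_t)int_tofp(max_perf), limit_fp));
}

/* Hypothetical helper: the expression used after the patch. */
static int perf_adj_after(int max_perf, int32_t limit_fp)
{
	/* The shift up in int_tofp() and the shift down in mul_fp() cancel. */
	return fp_toint(max_perf * limit_fp);
}
```

With max_perf = 24 and a limit of 128 (0.5 in this format), both helpers return 12. The shortcut relies on max_perf * limit_fp fitting in 32 bits, which holds for the small non-negative values used here.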



* [PATCH V5 3/4] Move the intel_pstate_calc_busy into get_target_pstate_use_performance
  2016-03-06  7:34 [PATCH V5 0/4] Reduce the intel_pstate timer overhead Philippe Longepe
  2016-03-06  7:34 ` [PATCH V5 1/4] Remove extra conversions in pid calculation Philippe Longepe
  2016-03-06  7:34 ` [PATCH V5 2/4] Optimize calculation for max/min_perf_adj Philippe Longepe
@ 2016-03-06  7:34 ` Philippe Longepe
  2016-03-07 21:36   ` Srinivas Pandruvada
  2016-03-06  7:34 ` [PATCH V5 4/4] intel_pstate: Remove the freq calculation from the intel_pstate_calc_busy function Philippe Longepe
  3 siblings, 1 reply; 9+ messages in thread
From: Philippe Longepe @ 2016-03-06  7:34 UTC (permalink / raw)
  To: linux-pm; +Cc: srinivas.pandruvada, rafael

The cpu_load algorithm doesn't need to invoke intel_pstate_calc_busy(),
so move that call from intel_pstate_sample() to
get_target_pstate_use_performance().

Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
---
 drivers/cpufreq/intel_pstate.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index c46d23a..903341f 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -912,8 +912,6 @@ static inline void intel_pstate_sample(struct cpudata *cpu, u64 time)
 	cpu->sample.mperf -= cpu->prev_mperf;
 	cpu->sample.tsc -= cpu->prev_tsc;
 
-	intel_pstate_calc_busy(cpu);
-
 	cpu->prev_aperf = aperf;
 	cpu->prev_mperf = mperf;
 	cpu->prev_tsc = tsc;
@@ -942,7 +940,6 @@ static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu)
 	mperf = cpu->sample.mperf + delta_iowait_mperf;
 	cpu->prev_cummulative_iowait = cummulative_iowait;
 
-
 	/*
 	 * The load can be estimated as the ratio of the mperf counter
 	 * running at a constant frequency during active periods
@@ -960,6 +957,8 @@ static inline int32_t get_target_pstate_use_performance(struct cpudata *cpu)
 	int32_t core_busy, max_pstate, current_pstate, sample_ratio;
 	u64 duration_ns;
 
+	intel_pstate_calc_busy(cpu);
+
 	/*
 	 * core_busy is the ratio of actual performance to max
 	 * max_pstate is the max non turbo pstate available
-- 
1.9.1



* [PATCH V5 4/4] intel_pstate: Remove the freq calculation from the intel_pstate_calc_busy function
  2016-03-06  7:34 [PATCH V5 0/4] Reduce the intel_pstate timer overhead Philippe Longepe
                   ` (2 preceding siblings ...)
  2016-03-06  7:34 ` [PATCH V5 3/4] Move the intel_pstate_calc_busy into get_target_pstate_use_performance Philippe Longepe
@ 2016-03-06  7:34 ` Philippe Longepe
  2016-03-07 21:42   ` Srinivas Pandruvada
  3 siblings, 1 reply; 9+ messages in thread
From: Philippe Longepe @ 2016-03-06  7:34 UTC (permalink / raw)
  To: linux-pm; +Cc: srinivas.pandruvada, rafael

Use a helper function to compute the average pstate and call it only
where it is needed (only when tracing or in intel_pstate_get).

Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
---
 drivers/cpufreq/intel_pstate.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 903341f..6e07366 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -878,12 +878,6 @@ static inline void intel_pstate_calc_busy(struct cpudata *cpu)
 	core_pct = int_tofp(sample->aperf) * int_tofp(100);
 	core_pct = div64_u64(core_pct, int_tofp(sample->mperf));
 
-	sample->freq = fp_toint(
-		mul_fp(int_tofp(
-			cpu->pstate.max_pstate_physical *
-			cpu->pstate.scaling / 100),
-			core_pct));
-
 	sample->core_pct_busy = (int32_t)core_pct;
 }
 
@@ -917,6 +911,12 @@ static inline void intel_pstate_sample(struct cpudata *cpu, u64 time)
 	cpu->prev_tsc = tsc;
 }
 
+static inline int32_t get_avg_frequency(struct cpudata *cpu)
+{
+	return div64_u64(cpu->pstate.max_pstate_physical * cpu->sample.aperf *
+		cpu->pstate.scaling, cpu->sample.mperf);
+}
+
 static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu)
 {
 	struct sample *sample = &cpu->sample;
@@ -1012,7 +1012,7 @@ static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu)
 		sample->mperf,
 		sample->aperf,
 		sample->tsc,
-		sample->freq);
+		get_avg_frequency(cpu));
 }
 
 static void intel_pstate_update_util(struct update_util_data *data, u64 time,
@@ -1101,7 +1101,7 @@ static unsigned int intel_pstate_get(unsigned int cpu_num)
 	if (!cpu)
 		return 0;
 	sample = &cpu->sample;
-	return sample->freq;
+	return get_avg_frequency(cpu);
 }
 
 static int intel_pstate_set_policy(struct cpufreq_policy *policy)
-- 
1.9.1
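
To compare the removed sample->freq expression with the new helper, here is a minimal user-space sketch. It assumes FRAC_BITS is 8, re-declares the fixed-point helpers for illustration, and uses plain 64-bit division in place of div64_u64(). freq_before() and freq_after() are hypothetical names, and the input values in the note below are made up for the example.

```c
#include <stdint.h>

/* Assumption: 8 fractional bits in the fixed-point format. */
#define FRAC_BITS 8
#define int_tofp(X) ((int64_t)(X) << FRAC_BITS)
#define fp_toint(X) ((X) >> FRAC_BITS)

static int32_t mul_fp(int32_t x, int32_t y)
{
	return (int32_t)(((int64_t)x * (int64_t)y) >> FRAC_BITS);
}

/* Hypothetical model of the removed code: goes through core_pct. */
static int32_t freq_before(int max_pstate_physical, int scaling,
			   uint64_t aperf, uint64_t mperf)
{
	/* core_pct is aperf / mperf as a fixed-point percentage. */
	int64_t core_pct = int_tofp(aperf) * int_tofp(100);

	core_pct = core_pct / int_tofp(mperf);
	return fp_toint(mul_fp((int32_t)int_tofp(max_pstate_physical *
						 scaling / 100),
			       (int32_t)core_pct));
}

/* Hypothetical model of get_avg_frequency(): one multiply, one divide. */
static int32_t freq_after(int max_pstate_physical, int scaling,
			  uint64_t aperf, uint64_t mperf)
{
	return (int32_t)((uint64_t)max_pstate_physical * aperf *
			 (uint64_t)scaling / mperf);
}
```

For max_pstate_physical = 24, scaling = 100000, aperf = 3000 and mperf = 4000, both functions return 1800000. When aperf / mperf is not exactly representable in 8 fractional bits (for example 1000 / 3000), the old path loses a little precision in core_pct, so the two results can differ slightly.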



* Re: [PATCH V5 1/4] Remove extra conversions in pid calculation
  2016-03-06  7:34 ` [PATCH V5 1/4] Remove extra conversions in pid calculation Philippe Longepe
@ 2016-03-07 21:30   ` Srinivas Pandruvada
  0 siblings, 0 replies; 9+ messages in thread
From: Srinivas Pandruvada @ 2016-03-07 21:30 UTC (permalink / raw)
  To: Philippe Longepe, linux-pm; +Cc: rafael

On Sun, 2016-03-06 at 08:34 +0100, Philippe Longepe wrote:
> pid->setpoint and pid->deadband can be initialize in float so we
> can remove the int_tofp in pid_calc.

Rafael had some comments on this patch:

"s/initialize/initialized/

> can remove the int_tofp in pid_calc.

This is not "float", but "fixed point".

Also "avoid" rather than "remove"."

Not sure if Rafael is OK with this patch going in without those changes.

> Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
> ---
>  drivers/cpufreq/intel_pstate.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index eb0aef0..114e4e0 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -198,8 +198,8 @@ static struct perf_limits *limits = &powersave_limits;
>  
>  static inline void pid_reset(struct _pid *pid, int setpoint, int busy,
>  			     int deadband, int integral) {
> -	pid->setpoint = setpoint;
> -	pid->deadband  = deadband;
> +	pid->setpoint = int_tofp(setpoint);
> +	pid->deadband  = int_tofp(deadband);
>  	pid->integral  = int_tofp(integral);
>  	pid->last_err  = int_tofp(setpoint) - int_tofp(busy);
>  }
> @@ -225,9 +225,9 @@ static signed int pid_calc(struct _pid *pid, int32_t busy)
>  	int32_t pterm, dterm, fp_error;
>  	int32_t integral_limit;
>  
> -	fp_error = int_tofp(pid->setpoint) - busy;
> +	fp_error = pid->setpoint - busy;
>  
> -	if (abs(fp_error) <= int_tofp(pid->deadband))
> +	if (abs(fp_error) <= pid->deadband)
>  		return 0;
>  
>  	pterm = mul_fp(pid->p_gain, fp_error);


* Re: [PATCH V5 2/4] Optimize calculation for max/min_perf_adj
  2016-03-06  7:34 ` [PATCH V5 2/4] Optimize calculation for max/min_perf_adj Philippe Longepe
@ 2016-03-07 21:35   ` Srinivas Pandruvada
  0 siblings, 0 replies; 9+ messages in thread
From: Srinivas Pandruvada @ 2016-03-07 21:35 UTC (permalink / raw)
  To: Philippe Longepe, linux-pm; +Cc: rafael

On Sun, 2016-03-06 at 08:34 +0100, Philippe Longepe wrote:
> mul_fp(int_tofp(A), B) expands to:
> ((A << FRAC_BITS) * B) >> FRAC_BITS, so the same result can be obtained
> via simple multiplication A * B.  Apply this observation to
> max_perf * limits->max_perf and max_perf * limits->min_perf in
> intel_pstate_get_min_max().
> 
> Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

> ---
>  drivers/cpufreq/intel_pstate.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 114e4e0..c46d23a 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -828,11 +828,11 @@ static void intel_pstate_get_min_max(struct cpudata *cpu, int *min, int *max)
>  	 * policy, or by cpu specific default values determined through
>  	 * experimentation.
>  	 */
> -	max_perf_adj = fp_toint(mul_fp(int_tofp(max_perf), limits->max_perf));
> +	max_perf_adj = fp_toint(max_perf * limits->max_perf);
>  	*max = clamp_t(int, max_perf_adj,
>  			cpu->pstate.min_pstate, cpu->pstate.turbo_pstate);
>  
> -	min_perf = fp_toint(mul_fp(int_tofp(max_perf), limits->min_perf));
> +	min_perf = fp_toint(max_perf * limits->min_perf);
>  	*min = clamp_t(int, min_perf, cpu->pstate.min_pstate, max_perf);
>  }
>  


* Re: [PATCH V5 3/4] Move the intel_pstate_calc_busy into get_target_pstate_use_performance
  2016-03-06  7:34 ` [PATCH V5 3/4] Move the intel_pstate_calc_busy into get_target_pstate_use_performance Philippe Longepe
@ 2016-03-07 21:36   ` Srinivas Pandruvada
  0 siblings, 0 replies; 9+ messages in thread
From: Srinivas Pandruvada @ 2016-03-07 21:36 UTC (permalink / raw)
  To: Philippe Longepe, linux-pm; +Cc: rafael

On Sun, 2016-03-06 at 08:34 +0100, Philippe Longepe wrote:
> The cpu_load algorithm doesn't need to invoke intel_pstate_calc_busy(),
> so move that call from intel_pstate_sample() to
> get_target_pstate_use_performance().
> 
> Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

> ---
>  drivers/cpufreq/intel_pstate.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index c46d23a..903341f 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -912,8 +912,6 @@ static inline void intel_pstate_sample(struct cpudata *cpu, u64 time)
>  	cpu->sample.mperf -= cpu->prev_mperf;
>  	cpu->sample.tsc -= cpu->prev_tsc;
>  
> -	intel_pstate_calc_busy(cpu);
> -
>  	cpu->prev_aperf = aperf;
>  	cpu->prev_mperf = mperf;
>  	cpu->prev_tsc = tsc;
> @@ -942,7 +940,6 @@ static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu)
>  	mperf = cpu->sample.mperf + delta_iowait_mperf;
>  	cpu->prev_cummulative_iowait = cummulative_iowait;
>  
> -
>  	/*
>  	 * The load can be estimated as the ratio of the mperf counter
>  	 * running at a constant frequency during active periods
> @@ -960,6 +957,8 @@ static inline int32_t get_target_pstate_use_performance(struct cpudata *cpu)
>  	int32_t core_busy, max_pstate, current_pstate, sample_ratio;
>  	u64 duration_ns;
>  
> +	intel_pstate_calc_busy(cpu);
> +
>  	/*
>  	 * core_busy is the ratio of actual performance to max
>  	 * max_pstate is the max non turbo pstate available


* Re: [PATCH V5 4/4] intel_pstate: Remove the freq calculation from the intel_pstate_calc_busy function
  2016-03-06  7:34 ` [PATCH V5 4/4] intel_pstate: Remove the freq calculation from the intel_pstate_calc_busy function Philippe Longepe
@ 2016-03-07 21:42   ` Srinivas Pandruvada
  0 siblings, 0 replies; 9+ messages in thread
From: Srinivas Pandruvada @ 2016-03-07 21:42 UTC (permalink / raw)
  To: Philippe Longepe, linux-pm; +Cc: rafael

On Sun, 2016-03-06 at 08:34 +0100, Philippe Longepe wrote:
> Use a helper function to compute the average pstate and call it only
> where it is needed (only when tracing or in intel_pstate_get).
> 
> Signed-off-by: Philippe Longepe <philippe.longepe@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> ---
>  drivers/cpufreq/intel_pstate.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 903341f..6e07366 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -878,12 +878,6 @@ static inline void intel_pstate_calc_busy(struct cpudata *cpu)
>  	core_pct = int_tofp(sample->aperf) * int_tofp(100);
>  	core_pct = div64_u64(core_pct, int_tofp(sample->mperf));
>  
> -	sample->freq = fp_toint(
> -		mul_fp(int_tofp(
> -			cpu->pstate.max_pstate_physical *
> -			cpu->pstate.scaling / 100),
> -			core_pct));
> -
>  	sample->core_pct_busy = (int32_t)core_pct;
>  }
>  
> @@ -917,6 +911,12 @@ static inline void intel_pstate_sample(struct cpudata *cpu, u64 time)
>  	cpu->prev_tsc = tsc;
>  }
>  
> +static inline int32_t get_avg_frequency(struct cpudata *cpu)
> +{
> +	return div64_u64(cpu->pstate.max_pstate_physical * cpu->sample.aperf *
> +		cpu->pstate.scaling, cpu->sample.mperf);
> +}
> +
>  static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu)
>  {
>  	struct sample *sample = &cpu->sample;
> @@ -1012,7 +1012,7 @@ static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu)
>  		sample->mperf,
>  		sample->aperf,
>  		sample->tsc,
> -		sample->freq);
> +		get_avg_frequency(cpu));
>  }
>  
>  static void intel_pstate_update_util(struct update_util_data *data, u64 time,
> @@ -1101,7 +1101,7 @@ static unsigned int intel_pstate_get(unsigned int cpu_num)
>  	if (!cpu)
>  		return 0;
>  	sample = &cpu->sample;
> -	return sample->freq;
> +	return get_avg_frequency(cpu);
>  }
>  
>  static int intel_pstate_set_policy(struct cpufreq_policy *policy)
