From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Stratos Karafotis <stratosk@semaphore.gr>,
Viresh Kumar <viresh.kumar@linaro.org>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Mark Brown <broonie@kernel.org>
Subject: [PATCH 3.10 13/13] cpufreq: ondemand: Change the calculation of target frequency
Date: Tue, 7 Oct 2014 16:20:18 -0700
Message-ID: <20141007231920.314796030@linuxfoundation.org>
In-Reply-To: <20141007231919.924479934@linuxfoundation.org>
3.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stratos Karafotis <stratosk@semaphore.gr>
commit dfa5bb622555d9da0df21b50f46ebdeef390041b upstream.
The ondemand governor calculates load in terms of frequency and
increases it only if load_freq is greater than up_threshold
multiplied by the current or average frequency. This appears to
produce oscillations of frequency between min and max because,
for example, a relatively small load can easily saturate minimum
frequency and lead the CPU to the max. Then, it will decrease
back to the min due to small load_freq.
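
The oscillation can be reproduced with a small stand-alone model of the
pre-patch decision, reconstructed from the lines this patch removes. It
is only a sketch: old_next_freq() is a hypothetical name, the measured
average frequency is assumed to equal the current frequency, and the
thresholds are the non-micro-accounting defaults (up_threshold = 80,
adj_up_threshold = 80 - 10). Frequencies are in kHz.

```c
/* Hypothetical user-space model of the pre-patch logic; not kernel code. */
enum {
	UP_THRESHOLD = 80,	/* DEF_FREQUENCY_UP_THRESHOLD */
	ADJ_UP_THRESHOLD = 70	/* up threshold minus down differential (10) */
};

static unsigned int old_next_freq(unsigned int load, unsigned int cur,
				  unsigned int min, unsigned int max)
{
	/* Load expressed in terms of frequency (assumes freq_avg == cur). */
	unsigned int load_freq = load * cur;

	/* Any increase jumps straight to the maximum frequency. */
	if (load_freq > UP_THRESHOLD * cur)
		return max;

	/* Already at the minimum: nothing to decrease. */
	if (cur == min)
		return cur;

	/* Decrease to the lowest frequency that stays under the threshold. */
	if (load_freq < ADJ_UP_THRESHOLD * cur)
		return load_freq / ADJ_UP_THRESHOLD;

	return cur;
}
```

With min = 100000 and max = 1000000, a load of 85% at the minimum gives
old_next_freq(85, 100000, 100000, 1000000) = 1000000. If the same work
then shows up as roughly 8% load at the maximum (an assumption, and not
necessarily true as noted below), old_next_freq(8, 1000000, 100000,
1000000) = 114285, which is close to the minimum again.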
Change the calculation method of load and target frequency on the
basis of the following two observations:
- Load computation should not depend on the current or average
measured frequency. For example, absolute load of 80% at 100MHz
is not necessarily equivalent to 8% at 1000MHz in the next
sampling interval.
- It should be possible to increase the target frequency to any
value present in the frequency table proportional to the absolute
load, rather than to the max only, so that:
Target frequency = C * load
where we take C = policy->cpuinfo.max_freq / 100.
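
As a rough illustration of the new rule, the sketch below combines the
up_threshold check with the proportional formula. sketch_next_freq() and
its parameter names are hypothetical; in the kernel the result is still
handed to the cpufreq driver, which resolves it to an entry of the
frequency table. Frequencies are in kHz and load is a percentage.

```c
/* Hypothetical user-space model of the post-patch logic; not kernel code. */
static unsigned int sketch_next_freq(unsigned int load,
				     unsigned int up_threshold,
				     unsigned int policy_max,
				     unsigned int cpuinfo_max_freq)
{
	/* Above the threshold the governor still goes to the policy maximum. */
	if (load > up_threshold)
		return policy_max;

	/*
	 * Otherwise: Target frequency = C * load, C = max_freq / 100.
	 * Multiplying first keeps integer precision; with load <= 100 and
	 * max_freq in kHz the product fits in 32 bits for realistic CPUs.
	 */
	return load * cpuinfo_max_freq / 100;
}
```

For a 3.40GHz part with up_threshold = 80, a load of 50% gives
sketch_next_freq(50, 80, 3400000, 3400000) = 1700000, while a load of
85% gives 3400000. The target no longer depends on the frequency at
which the load was measured.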
Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
increase of ~1.5% in performance. With this patch, cpufreq_stats
(time_in_state) shows that middle frequencies are used more, while
the highest and lowest frequencies are used less by ~9%.
[rjw: We have run multiple other tests on kernels with this
change applied and in the vast majority of cases it turns out
that the resulting performance improvement also leads to reduced
consumption of energy. The change is additionally justified by
the overall simplification of the code in question.]
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/cpufreq/cpufreq_governor.c | 10 ---------
drivers/cpufreq/cpufreq_governor.h | 1
drivers/cpufreq/cpufreq_ondemand.c | 39 ++++++-------------------------------
3 files changed, 8 insertions(+), 42 deletions(-)
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -97,7 +97,7 @@ void dbs_check_cpu(struct dbs_data *dbs_
policy = cdbs->cur_policy;
- /* Get Absolute Load (in terms of freq for ondemand gov) */
+ /* Get Absolute Load */
for_each_cpu(j, policy->cpus) {
struct cpu_dbs_common_info *j_cdbs;
u64 cur_wall_time, cur_idle_time;
@@ -148,14 +148,6 @@ void dbs_check_cpu(struct dbs_data *dbs_
load = 100 * (wall_time - idle_time) / wall_time;
- if (dbs_data->cdata->governor == GOV_ONDEMAND) {
- int freq_avg = __cpufreq_driver_getavg(policy, j);
- if (freq_avg <= 0)
- freq_avg = policy->cur;
-
- load *= freq_avg;
- }
-
if (load > max_load)
max_load = load;
}
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -169,7 +169,6 @@ struct od_dbs_tuners {
unsigned int sampling_rate;
unsigned int sampling_down_factor;
unsigned int up_threshold;
- unsigned int adj_up_threshold;
unsigned int powersave_bias;
unsigned int io_is_busy;
};
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -29,11 +29,9 @@
#include "cpufreq_governor.h"
/* On-demand governor macros */
-#define DEF_FREQUENCY_DOWN_DIFFERENTIAL (10)
#define DEF_FREQUENCY_UP_THRESHOLD (80)
#define DEF_SAMPLING_DOWN_FACTOR (1)
#define MAX_SAMPLING_DOWN_FACTOR (100000)
-#define MICRO_FREQUENCY_DOWN_DIFFERENTIAL (3)
#define MICRO_FREQUENCY_UP_THRESHOLD (95)
#define MICRO_FREQUENCY_MIN_SAMPLE_RATE (10000)
#define MIN_FREQUENCY_UP_THRESHOLD (11)
@@ -161,14 +159,10 @@ static void dbs_freq_increase(struct cpu
/*
* Every sampling_rate, we check, if current idle time is less than 20%
- * (default), then we try to increase frequency. Every sampling_rate, we look
- * for the lowest frequency which can sustain the load while keeping idle time
- * over 30%. If such a frequency exist, we try to decrease to this frequency.
- *
- * Any frequency increase takes it to the maximum frequency. Frequency reduction
- * happens at minimum steps of 5% (default) of current frequency
+ * (default), then we try to increase frequency. Else, we adjust the frequency
+ * proportional to load.
*/
-static void od_check_cpu(int cpu, unsigned int load_freq)
+static void od_check_cpu(int cpu, unsigned int load)
{
struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
@@ -178,29 +172,17 @@ static void od_check_cpu(int cpu, unsign
dbs_info->freq_lo = 0;
/* Check for frequency increase */
- if (load_freq > od_tuners->up_threshold * policy->cur) {
+ if (load > od_tuners->up_threshold) {
/* If switching to max speed, apply sampling_down_factor */
if (policy->cur < policy->max)
dbs_info->rate_mult =
od_tuners->sampling_down_factor;
dbs_freq_increase(policy, policy->max);
return;
- }
-
- /* Check for frequency decrease */
- /* if we cannot reduce the frequency anymore, break out early */
- if (policy->cur == policy->min)
- return;
-
- /*
- * The optimal frequency is the frequency that is the lowest that can
- * support the current CPU usage without triggering the up policy. To be
- * safe, we focus 10 points under the threshold.
- */
- if (load_freq < od_tuners->adj_up_threshold
- * policy->cur) {
+ } else {
+ /* Calculate the next frequency proportional to load */
unsigned int freq_next;
- freq_next = load_freq / od_tuners->adj_up_threshold;
+ freq_next = load * policy->cpuinfo.max_freq / 100;
/* No longer fully busy, reset rate_mult */
dbs_info->rate_mult = 1;
@@ -374,9 +356,6 @@ static ssize_t store_up_threshold(struct
input < MIN_FREQUENCY_UP_THRESHOLD) {
return -EINVAL;
}
- /* Calculate the new adj_up_threshold */
- od_tuners->adj_up_threshold += input;
- od_tuners->adj_up_threshold -= od_tuners->up_threshold;
od_tuners->up_threshold = input;
return count;
@@ -525,8 +504,6 @@ static int od_init(struct dbs_data *dbs_
if (idle_time != -1ULL) {
/* Idle micro accounting is supported. Use finer thresholds */
tuners->up_threshold = MICRO_FREQUENCY_UP_THRESHOLD;
- tuners->adj_up_threshold = MICRO_FREQUENCY_UP_THRESHOLD -
- MICRO_FREQUENCY_DOWN_DIFFERENTIAL;
/*
* In nohz/micro accounting case we set the minimum frequency
* not depending on HZ, but fixed (very low). The deferred
@@ -535,8 +512,6 @@ static int od_init(struct dbs_data *dbs_
dbs_data->min_sampling_rate = MICRO_FREQUENCY_MIN_SAMPLE_RATE;
} else {
tuners->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
- tuners->adj_up_threshold = DEF_FREQUENCY_UP_THRESHOLD -
- DEF_FREQUENCY_DOWN_DIFFERENTIAL;
/* For correct statistics, we need 10 ticks for each measure */
dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *