public inbox for linux-kernel@vger.kernel.org
From: Mike Galbraith <efault@gmx.de>
To: Peter Zijlstra <peterz@infradead.org>
Cc: lkml <linux-kernel@vger.kernel.org>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Paul Turner <pjt@google.com>,
	Arjan Van De Ven <arjan@linux.intel.com>,
	Andreas Herrmann <andreas.herrmann3@amd.com>
Subject: Re: [rfc][patch] select_idle_sibling() inducing bouncing on westmere
Date: Sun, 27 May 2012 11:17:39 +0200
Message-ID: <1338110259.7678.77.camel@marge.simpson.net>
In-Reply-To: <1338020834.7747.8.camel@marge.simpson.net>

On Sat, 2012-05-26 at 10:27 +0200, Mike Galbraith wrote: 
> Hohum, back to finding out what happened to cpufreq.

Answer: nothing.. in mainline.

I habitually test with the performance governor, so I just never noticed
how badly ondemand sucks.  In enterprise, I found the patch below, which
explains why cores crank up fine there, but not in mainline.  Somebody
thumped ondemand properly on its pointy head.

But check out the numbers below the patch, and you can see just how
horrible bouncing is when you add governor latency _on top_ of it.

---
drivers/cpufreq/cpufreq_ondemand.c |   25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -37,6 +37,7 @@
 #define MICRO_FREQUENCY_MIN_SAMPLE_RATE (10000)
 #define MIN_FREQUENCY_UP_THRESHOLD (11)
 #define MAX_FREQUENCY_UP_THRESHOLD (100)
+#define MAX_DEFAULT_SAMPLING_RATE (300 * 1000U)

 /*
  * The polling frequency of this governor depends on the capability of
@@ -733,6 +734,30 @@ static int cpufreq_governor_dbs(struct c
 				max(min_sampling_rate,
 				    latency * LATENCY_MULTIPLIER);
 			dbs_tuners_ins.io_is_busy = should_io_be_busy();
+			/*
+			 * Cut def_sampling rate to 300ms if it was above,
+			 * still consider to not set it above latency
+			 * transition * 100
+			 */
+			if (dbs_tuners_ins.sampling_rate > MAX_DEFAULT_SAMPLING_RATE) {
+				dbs_tuners_ins.sampling_rate =
+					max(min_sampling_rate, MAX_DEFAULT_SAMPLING_RATE);
+				printk(KERN_INFO "CPUFREQ: ondemand sampling "
+				       "rate set to %d ms\n",
+				       dbs_tuners_ins.sampling_rate / 1000);
+			}
+			/*
+			 * Be conservative in respect to performance.
+			 * If an application calculates using two threads
+			 * depending on each other, they will be run on several
+			 * CPU cores resulting on 50% load on both.
+			 * SLED might still want to prefer 80% up_threshold
+			 * by default, but we cannot differ that here.
+			 */
+			if (num_online_cpus() > 1)
+				dbs_tuners_ins.up_threshold =
+					DEF_FREQUENCY_UP_THRESHOLD / 2;
+
 		}
 		mutex_unlock(&dbs_mutex);
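
For anyone who wants the effect of the second hunk without the
surrounding driver code, here is a minimal userspace sketch of the same
two adjustments.  The struct and function names (tuners, apply_defaults)
are hypothetical, and the value 80 for DEF_FREQUENCY_UP_THRESHOLD is an
assumption taken from the comment in the patch.  Sampling rates are in
microseconds, as the division by 1000 in the printk implies.

```c
#include <assert.h>

/* Mirrors the constant added by the patch: 300 ms expressed in microseconds. */
#define MAX_DEFAULT_SAMPLING_RATE (300 * 1000U)
/* Assumed default (percent); the patch comment mentions an 80% up_threshold. */
#define DEF_FREQUENCY_UP_THRESHOLD 80U

/* Hypothetical stand-in for the two dbs_tuners_ins fields the patch touches. */
struct tuners {
	unsigned int sampling_rate;	/* microseconds */
	unsigned int up_threshold;	/* percent busy before ramping up */
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Apply the patch's two default adjustments to an already computed
 * sampling rate:
 *  - cap the rate at 300 ms, but never go below min_sampling_rate
 *  - halve up_threshold when more than one CPU is online
 */
static struct tuners apply_defaults(unsigned int sampling_rate,
				    unsigned int min_sampling_rate,
				    unsigned int online_cpus)
{
	struct tuners t = { sampling_rate, DEF_FREQUENCY_UP_THRESHOLD };

	assert(online_cpus > 0);

	if (t.sampling_rate > MAX_DEFAULT_SAMPLING_RATE)
		t.sampling_rate = max_u(min_sampling_rate,
					MAX_DEFAULT_SAMPLING_RATE);

	if (online_cpus > 1)
		t.up_threshold = DEF_FREQUENCY_UP_THRESHOLD / 2;

	return t;
}
```

Under those assumptions, a 1 s default sampling rate on a multi-core box
ends up at 300 ms with an up_threshold of 40, while a single-CPU box
with a 50 ms rate is left alone.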


patches applied to both trees
patches/remove_irritating_plus.diff
patches/clockevents-Reinstate-the-per-cpu-tick-skew.patch
patches/sched-cgroups-Disallow-attaching-kthreadd
patches/sched-fix-task_groups-list
patches/sched-rt-fix-isolated-CPUs-leaving-root_task_group-indefinitely-throttled.patch
patches/sched-throttle-nohz.patch
patches/sched-domain-flags-proc-handler.patch
patches/sched-fix-Q6600.patch
patches/cpufreq_ondemand_performance_optimise_default_settings.patch

applied only to 3.4.0x
patches/sched-tweak-select_idle_sibling.patch 

tbench 1
3.4.0          351 MB/sec ondemand
               350 MB/sec
               351 MB/sec

3.4.0x         428 MB/sec ondemand
               432 MB/sec
               425 MB/sec
vs 3.4.0       1.22

3.4.0          363 MB/sec performance
               369 MB/sec
               359 MB/sec
               
3.4.0x         432 MB/sec performance
               430 MB/sec
               427 MB/sec
vs 3.4.0       1.18

netperf TCP_RR  1 byte ping/pong (trans/sec)

governor ondemand
                 unbound          bound
3.4.0              72851         128433
                   72347         127301
                   72512         127472
         
3.4.0x            128440         131979
                  128116         132413
                  128366         132004
vs 3.4.0           1.768          1.034
                   ^^^^^ eek!     (hm, why bound improvement?)

governor performance
3.4.0             105199         127140
                  104534         128786
                  104167         127920

3.4.0x            123451         132883
                  128702         132688
                  125653         133005
vs 3.4.0           1.203          1.038
                                  (hm, why bound improvement?)
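
The "vs 3.4.0" rows appear to be the ratio of the three-run means.  The
mail does not say so explicitly, so treat that as an assumption.  A
minimal sketch of the calculation, with hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Ratio of the patched kernel's mean to the baseline kernel's mean. */
static double speedup(const double *base, const double *patched, size_t n)
{
	return mean(patched, n) / mean(base, n);
}
```

Fed the ondemand unbound TCP_RR columns, this gives roughly 1.768, and
the ondemand tbench runs give roughly 1.22, in line with the rows above.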

select_idle_sibling() becomes a proper throughput/latency trade on
Westmere as well, with only modest cost even for worst case load that
does at least a dinky bit of work (TCP_RR == 100% synchronous).
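
To make "100% synchronous" concrete: in a request/response load each
side sleeps until the other answers, so every transaction costs two
wakeups, and wakeup placement matters on each of them.  The sketch below
is not netperf.  It is a hypothetical stand-in (ping_pong is an invented
name) that bounces one byte over an AF_UNIX socketpair between a parent
and a forked child, which is enough to see the pattern.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `rounds` one-byte request/response transactions between the
 * caller (client) and a forked child (server).
 * Returns the number of completed transactions, or -1 on setup failure.
 */
static long ping_pong(long rounds)
{
	int sv[2];
	char byte = 'x';
	long done = 0;
	pid_t pid;

	assert(rounds >= 0);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	if (pid == 0) {
		/* Server: echo each byte back until the client closes. */
		close(sv[0]);
		while (read(sv[1], &byte, 1) == 1)
			if (write(sv[1], &byte, 1) != 1)
				break;
		_exit(0);
	}

	/* Client: send one byte, then block until the echo arrives. */
	close(sv[1]);
	while (done < rounds &&
	       write(sv[0], &byte, 1) == 1 &&
	       read(sv[0], &byte, 1) == 1)
		done++;

	close(sv[0]);		/* EOF tells the server to exit */
	waitpid(pid, NULL, 0);
	return done;
}
```

Timing a call such as ping_pong(100000) and dividing rounds by elapsed
seconds gives a trans/sec figure in the same spirit as the table, though
not comparable to the TCP numbers above.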

-Mike

