linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking
       [not found] ` <1432518587-114210-3-git-send-email-yuyang.du@intel.com>
@ 2015-05-26 16:06   ` Vincent Guittot
  2015-05-27 22:36     ` Yuyang Du
  0 siblings, 1 reply; 13+ messages in thread
From: Vincent Guittot @ 2015-05-26 16:06 UTC (permalink / raw)
  To: Yuyang Du
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Paul Turner,
	Benjamin Segall, Morten Rasmussen, Dietmar Eggemann,
	arjan.van.de.ven, Len Brown, rafael.j.wysocki,
	fengguang.wu@intel.com

On 25 May 2015 at 03:49, Yuyang Du <yuyang.du@intel.com> wrote:
[snip]

>
> @@ -2585,334 +2583,156 @@ static __always_inline int __update_entity_runnable_avg(u64 now, int cpu,
>                 periods = delta / 1024;
>                 delta %= 1024;
>
> -               sa->runnable_avg_sum = decay_load(sa->runnable_avg_sum,
> -                                                 periods + 1);
> -               sa->running_avg_sum = decay_load(sa->running_avg_sum,
> -                                                 periods + 1);
> -               sa->avg_period = decay_load(sa->avg_period,
> -                                                    periods + 1);
> +               sa->load_sum = decay_load(sa->load_sum, periods + 1);
> +               sa->util_sum = decay_load(u64(sa->util_sum), periods + 1);

Hi Yuyang,

Brackets are missing around u64 to cast util_sum


>
>                 /* Efficiently calculate \sum (1..n_period) 1024*y^i */
> -               runnable_contrib = __compute_runnable_contrib(periods);
> -               if (runnable)
> -                       sa->runnable_avg_sum += runnable_contrib;
> +               contrib = __compute_runnable_contrib(periods);
> +               if (weight)

>
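
For reference, the function-style cast "u64(sa->util_sum)" flagged above is C++
syntax and does not compile as C; only the parenthesized cast operator does. A
minimal standalone illustration (uint32_t/uint64_t standing in for the kernel's
u32/u64 typedefs):

#include <stdint.h>

typedef uint32_t u32;
typedef uint64_t u64;

int main(void)
{
	u32 util_sum = 1024;
	u64 widened;

	widened = (u64)util_sum;	/* valid C: the cast operator */
	/* widened = u64(util_sum); */	/* function-style cast: C++ only,
					 * a compile error in C */
	(void)widened;
	return 0;
}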

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking
  2015-05-26 16:06   ` [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking Vincent Guittot
@ 2015-05-27 22:36     ` Yuyang Du
  0 siblings, 0 replies; 13+ messages in thread
From: Yuyang Du @ 2015-05-27 22:36 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Paul Turner,
	Benjamin Segall, Morten Rasmussen, Dietmar Eggemann,
	arjan.van.de.ven, Len Brown, rafael.j.wysocki,
	fengguang.wu@intel.com

On Tue, May 26, 2015 at 06:06:23PM +0200, Vincent Guittot wrote:
> > +               sa->util_sum = decay_load(u64(sa->util_sum), periods + 1);
> 
> Brackets are missing around u64 to cast util_sum
> 

My apology for this, and thank you, Vincent.

Sending the below patch here instead of sending the series.

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2dd201e..a8fd7b9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2584,7 +2584,7 @@ static __always_inline int __update_load_avg(u64 now, int cpu,
 		delta %= 1024;
 
 		sa->load_sum = decay_load(sa->load_sum, periods + 1);
-		sa->util_sum = decay_load(u64(sa->util_sum), periods + 1);
+		sa->util_sum = decay_load((u64)(sa->util_sum), periods + 1);
 
 		/* Efficiently calculate \sum (1..n_period) 1024*y^i */
 		contrib = __compute_runnable_contrib(periods);
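
For readers following along, the comment above refers to the per-period
geometric series that __compute_runnable_contrib() evaluates in fixed point. A
short floating-point sketch of the same sum, assuming the PELT constant
y^32 = 1/2:

#include <stdio.h>
#include <math.h>

int main(void)
{
	/* y is defined so that y^32 = 0.5: a period's contribution
	 * halves roughly every 32 periods (~32ms). */
	double y = pow(0.5, 1.0 / 32.0);
	double sum = 0.0;
	unsigned int n = 10, i;

	/* \sum (1..n) 1024*y^i -- what the kernel computes in fixed
	 * point from precomputed tables. */
	for (i = 1; i <= n; i++)
		sum += 1024.0 * pow(y, i);

	printf("contrib(%u) ~= %.0f\n", n, sum);	/* ~9100 for n = 10 */
	return 0;
}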

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 0/4] sched: Rewrite runnable load and utilization average tracking
       [not found] <1432518587-114210-1-git-send-email-yuyang.du@intel.com>
       [not found] ` <1432518587-114210-3-git-send-email-yuyang.du@intel.com>
@ 2015-06-02  0:25 ` Yuyang Du
  2015-06-09  1:21 ` Yuyang Du
  2 siblings, 0 replies; 13+ messages in thread
From: Yuyang Du @ 2015-06-02  0:25 UTC (permalink / raw)
  To: mingo, peterz, linux-kernel
  Cc: pjt, bsegall, morten.rasmussen, vincent.guittot, dietmar.eggemann,
	arjan.van.de.ven, len.brown, rafael.j.wysocki, fengguang.wu

Ping once more...

On Mon, May 25, 2015 at 09:49:43AM +0800, Yuyang Du wrote:
> Hi Peter and Ingo,
> 
> Changes are made for the 8th version:
> 
> 1) Rebase to the latest tip tree
> 2) scale_load_down the weight when doing the averages
> 3) change util_sum to u32
> 
> Thanks a lot for Ben's comments, which led to this version.
> 
> Regards,
> Yuyang
> 
> v7 changes:
> 
> The 7th version is mostly to accommodate the utilization load average recently
> merged into the kernel. The general idea remains to update the cfs_rq as a
> whole, as opposed to updating only one entity at a time and then updating the
> cfs_rq from that single updated entity.
> 
> 1) Rename utilization_load_avg to util_avg to be concise and meaningful
> 
> 2) To track the cfs_rq util_avg, simply use "cfs_rq->curr != NULL" as the
> predicate (see the sketch after this list). This should be equivalent to, but
> simpler than, aggregating each individual child sched_entity's util_avg when
> "cfs_rq->curr == se", because if cfs_rq->curr != NULL, cfs_rq->curr has to be
> some se.
> 
> 3) Remove the se's util_avg from its cfs_rq when migrating it; this was already
> proposed by Morten, and patches were sent
> 
> 4) The group entity's load average is initialized when the entity is created
> 
> 5) Small nits: the entity's util_avg is removed from switched_from_fair()
> and task_move_group_fair().
> 
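A sketch of the predicate in item 2, shaped like the cfs_rq update this series
moves to (the function name and signature follow the v8 patch; treat this as
illustrative, not the literal hunk):

/*
 * Update the cfs_rq's sched_avg as a whole: the utilization ("running")
 * part accumulates whenever the cfs_rq has a current entity, because
 * cfs_rq->curr != NULL exactly when some child se is cfs_rq->curr.
 */
static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
{
	return __update_load_avg(now, cpu_of(rq_of(cfs_rq)), &cfs_rq->avg,
				 scale_load_down(cfs_rq->load.weight),
				 cfs_rq->curr != NULL);
}
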
> Thanks a lot to Vincent and Morten for their help with the 7th version.
> 
> Thanks,
> Yuyang
> 
> v6 changes:
> 
> Many thanks to PeterZ for his review, to Dietmar, and to Fengguang for 0Day and LKP.
> 
> Rebased on v3.18-rc2.
> 
> - Unify decay_load 32 and 64 bits by mul_u64_u32_shr (see the sketch after
>   this list)
> - Add force option in update_tg_load_avg
> - Read real-time cfs's load_avg for calc_tg_weight
> - Have tg_load_avg_contrib ifdef CONFIG_FAIR_GROUP_SCHED
> - Bug fix
> 
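A sketch of the unified decay_load() mentioned in the first item, in the shape
it later took (LOAD_AVG_PERIOD is 32 and runnable_avg_yN_inv[] is the
precomputed y^n table; details are illustrative):

/*
 * val * y^n, where y^32 ~= 0.5: shift right once per full 32-period
 * chunk, then use the table for the remainder. mul_u64_u32_shr() does
 * the 64x32->64 multiply on both 32bit and 64bit, unifying the paths.
 */
static __always_inline u64 decay_load(u64 val, u64 n)
{
	unsigned int local_n;

	if (!n)
		return val;
	else if (unlikely(n > LOAD_AVG_PERIOD * 63))
		return 0;

	local_n = n;
	if (unlikely(local_n >= LOAD_AVG_PERIOD)) {
		val >>= local_n / LOAD_AVG_PERIOD;
		local_n %= LOAD_AVG_PERIOD;
	}

	return mul_u64_u32_shr(val, runnable_avg_yN_inv[local_n], 32);
}
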
> v5 changes:
> 
> Thanks to Peter for intensively reviewing this patchset in detail and for all
> his comments, to Mike for the general and cgroup pipe-tests, and to Morten,
> Ben, and Vincent for the discussion.
> 
> - Remove dead task and task group load_avg
> - Do not update the task_group load_avg for a trivial delta (threshold:
>   old_contrib/64)
> - mul_u64_u32_shr() is used in decay_load, so on 64bit, load_sum can afford
>   about 4353082796 (=2^64/47742/88761) entities with the highest weight (=88761)
>   always runnable, greater than the previous theoretical maximum of 132845
> - Various code efficiency and style changes
> 
> We carried out some performance tests (thanks to Fengguang and his LKP). The
> results are shown as follows. The patchset (including three patches) is on top
> of mainline v3.16-rc5. We may report more perf numbers later.
> 
> Overall, this rewrite has better performance: reduced net overhead in load
> average tracking, and flat efficiency in the multi-layer cgroup pipe-test.
> 
> v4 changes:
> 
> Thanks to Morten, Ben, and Fengguang for the v4 revision.
> 
> - Insert memory barrier before writing cfs_rq->load_last_update_copy.
> - Fix typos.
> 
> v3 changes:
> 
> Many thanks to Ben for the v3 revision.
> 
> Regarding the overflow issue, we now have for both entity and cfs_rq:
> 
> struct sched_avg {
>     .....
>     u64 load_sum;
>     unsigned long load_avg;
>     .....
> };
> 
> Given the weight for both entity and cfs_rq is:
> 
> struct load_weight {
>     unsigned long weight;
>     .....
> };
> 
> So, load_sum's max is 47742 * load.weight (which is unsigned long); on 32bit it
> is absolutely safe. On 64bit, with unsigned long being 64bit, we can afford
> about 4353082796 (=2^64/47742/88761) entities with the highest weight (=88761)
> always runnable; even considering that we may multiply by 1<<15 in decay_load64,
> we can still support 132845 (=4353082796/2^15) always-runnable entities, which
> should be acceptable.
> 
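A quick standalone check of the headroom arithmetic above (47742 being the
maximum accumulated-period factor and 88761 the highest weight, as stated):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* max load_sum of one always-runnable, highest-weight entity */
	uint64_t per_entity = 47742ULL * 88761ULL;
	uint64_t entities = UINT64_MAX / per_entity;

	/* ~4353082796 such entities fit in 2^64 ... */
	printf("%llu\n", (unsigned long long)entities);
	/* ... and 132845 remain after the 1<<15 factor in decay_load64 */
	printf("%llu\n", (unsigned long long)(entities >> 15));
	return 0;
}
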
> load_avg = load_sum / 47742 = load.weight (which is unsigned long), so it should
> be perfectly safe for both the entity (even with an arbitrary user group share)
> and the cfs_rq, on both 32bit and 64bit. Originally, we saved this division, but
> we have to get it back because of the overflow issue on 32bit (the load average
> itself is actually safe from overflow, but the rest of the code referencing it
> always uses long, such as cpu_load, etc., which prevents us from saving the
> division).
> 
> - Fix the overflow issue for both the entity and the cfs_rq, on both 32bit and
>   64bit.
> - Track all entities (both task and group entities) due to the group entity's
>   clock issue. This actually improves code simplicity.
> - Make a copy of the cfs_rq sched_avg's last_update_time, to read an intact 64bit
>   variable on a 32bit machine in case of a data race (hope I did it right; see
>   the sketch after this list).
> - Minor fixes and code improvement.
> 
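A sketch of the 32bit read pattern behind the copy item above, together with
the memory barrier from the v4 changes, in the shape it later took
(illustrative, not the literal hunks):

/* Writer: publish last_update_time, then the copy, ordered by a write
 * barrier, so a 32bit reader can detect a torn 64bit load. */
cfs_rq->avg.last_update_time = now;
#ifndef CONFIG_64BIT
smp_wmb();
cfs_rq->load_last_update_time_copy = cfs_rq->avg.last_update_time;
#endif

/* Reader: retry until both reads agree, i.e. the value is intact. */
u64 last_update_time, copy;

do {
	copy = cfs_rq->load_last_update_time_copy;
	smp_rmb();
	last_update_time = cfs_rq->avg.last_update_time;
} while (last_update_time != copy);
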
> v2 changes:
> 
> Thanks to PeterZ and Ben for their help in fixing the issues and improving the
> quality, and to Fengguang and his 0Day for finding compile errors in different
> configurations for version 2.
> 
> - Batch update the tg->load_avg, making sure it is up to date before
>   update_cfs_shares
> - Remove the migrating task from the old CPU/cfs_rq, and do so with atomic
>   operations (see the sketch after this list)
> 
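A sketch of the atomic removal in the last item, in the shape it later took:
the migrating side only accumulates the amount to subtract, and the cfs_rq
update folds it in (illustrative; LOAD_AVG_MAX is the 47742 constant above):

/* Migrating task's side: may race with the cfs_rq update, so just
 * accumulate what must be removed, atomically. */
atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);

/* cfs_rq update side: fold pending removals into the averages. */
if (atomic_long_read(&cfs_rq->removed_load_avg)) {
	long r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
	sa->load_avg = max_t(long, sa->load_avg - r, 0);
	sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
}
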
> Yuyang Du (4):
>   sched: Remove rq's runnable avg
>   sched: Rewrite runnable load and utilization average tracking
>   sched: Init cfs_rq's sched_entity load average
>   sched: Remove task and group entity load when they are dead
> 
>  include/linux/sched.h |  40 ++-
>  kernel/sched/core.c   |   5 +-
>  kernel/sched/debug.c  |  42 +---
>  kernel/sched/fair.c   | 668 +++++++++++++++++---------------------------------
>  kernel/sched/sched.h  |  32 +--
>  5 files changed, 261 insertions(+), 526 deletions(-)
> 
> -- 
> 2.1.3

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 0/4] sched: Rewrite runnable load and utilization average tracking
       [not found] <1432518587-114210-1-git-send-email-yuyang.du@intel.com>
       [not found] ` <1432518587-114210-3-git-send-email-yuyang.du@intel.com>
  2015-06-02  0:25 ` [PATCH v8 0/4] " Yuyang Du
@ 2015-06-09  1:21 ` Yuyang Du
  2015-06-15 10:38   ` Boqun Feng
  2 siblings, 1 reply; 13+ messages in thread
From: Yuyang Du @ 2015-06-09  1:21 UTC (permalink / raw)
  To: mingo, peterz, linux-kernel
  Cc: pjt, bsegall, morten.rasmussen, vincent.guittot, dietmar.eggemann,
	arjan.van.de.ven, len.brown, rafael.j.wysocki, fengguang.wu

Ping ...

Plus some data from tests run by LKP:

To name a few host configurations:

host: brickland3
model: Brickland Ivy Bridge-EX
nr_cpu: 120
memory: 512G

host: lkp-a03
model: Atom
memory: 8G

host: grantley
model: Grantley Haswell-EP
memory: 32G

host: ivb43
model: Ivytown Ivy Bridge-EP
nr_cpu: 48
memory: 64G

       base       4c59e142fee20a57617cff2250  testbox/testcase/testparams
----------------  --------------------------  ---------------------------
         %stddev     %change         %stddev
             \          |                \  
      1620 ±  0%     -12.3%       1421 ±  0%  aim7/performance-100-fork_test/lkp-a03
       232 ±  0%      +0.2%        233 ±  0%  aim7/performance-20-sieve/lkp-a03
      3409 ±  0%      +0.0%       3410 ±  0%  aim7/performance-200-array_rtns/lkp-a03
      6784 ±  0%      +1.1%       6860 ±  1%  aim7/performance-200-dir_rtns_1/lkp-a03
     26745 ±  2%     +33.6%      35721 ±  0%  aim7/performance-2000-fork_test/brickland3
     41873 ±  0%      +0.4%      42027 ±  0%  aim7/performance-3000-brk_test/brickland3
     44047 ± 30%     -45.2%      24135 ± 18%  aim7/performance-3000-disk_cp/brickland3
     18414 ± 16%     +90.5%      35078 ±  0%  aim7/performance-3000-disk_rd/brickland3
     71783 ±  2%     +11.7%      80188 ±  0%  aim7/performance-3000-sieve/brickland3
     10132 ±  0%      +0.0%      10136 ±  0%  aim7/performance-600-add_float/lkp-a03
     13005 ±  0%      -1.3%      12842 ±  0%  aim7/performance-600-creat-clo/lkp-a03
     18956 ±  0%      -0.2%      18924 ±  0%  aim7/performance-600-disk_cp/lkp-a03
     12324 ±  0%      +0.4%      12375 ±  0%  aim7/performance-600-mem_rtns_1/lkp-a03
    661018 ±  3%      +9.6%     724296 ±  2%  aim7/performance-8000-disk_src/brickland3
   1165368 ± 16%     +19.0%    1386957 ± 40%  aim7/performance-8000-disk_wrt/brickland3
   1178357 ±  0%      -1.7%    1157892 ±  0%  aim7/performance-8000-mem_rtns_1/brickland3
    882191 ±  0%     +27.2%    1122167 ±  0%  aim7/performance-8000-misc_rtns_1/brickland3
   1220753 ±  1%      -1.0%    1208609 ±  1%  aim7/performance-8000-sort_rtns_1/brickland3
    792481 ±  0%      -0.1%     791469 ±  0%  aim7/performance-8000-string_rtns/brickland3
     43506            +4.3%      45368        GEO-MEAN aim7.jobs-per-min

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
     79.57 ±  1%      -2.1%      77.94 ±  1%  kbuild/performance-200%/grantley
     87.34 ±  0%      +5.3%      91.94 ±  0%  kbuild/performance-50%/grantley
     89.57 ±  0%      +1.6%      90.97 ±  0%  kbuild/powersave-200%/grantley
     98.61 ±  0%      +6.9%     105.42 ±  0%  kbuild/powersave-50%/grantley
     88.51            +2.9%      91.04        GEO-MEAN kbuild.time.elapsed_time

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
     12266 ±  0%      +0.3%      12305 ±  0%  ebizzy/performance-200%-100x-10s/lkp-nex04
     12265            +0.3%      12304        GEO-MEAN ebizzy.throughput

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
     10414 ±  3%      +0.7%      10484 ±  2%  fsmark/performance-1x-1t-1BRD_32G-ext4-4K-4G-fsyncBeforeClose-1fpd/ivb43
      1571 ±  0%      +1.7%       1599 ±  1%  fsmark/performance-1x-1t-1BRD_32G-ext4-nfsv4-4K-4G-fsyncBeforeClose-1fpd/ivb43
      5294 ±  0%      -0.1%       5289 ±  0%  fsmark/performance-1x-1t-1BRD_32G-xfs-4K-4G-fsyncBeforeClose-1fpd/ivb43
      1401 ±  0%      +2.6%       1439 ±  0%  fsmark/performance-1x-1t-1BRD_32G-xfs-nfsv4-4K-4G-fsyncBeforeClose-1fpd/ivb43
      3320            +1.2%       3361        GEO-MEAN fsmark.files_per_sec

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
      4550 ±  0%      +0.5%       4570 ±  0%  ftq/performance-100%-20x-100000ss/lkp-nex06
      4550            +0.5%       4570        GEO-MEAN ftq.counts

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
      0.61 ±  0%      +0.1%       0.61 ±  0%  ftq/performance-100%-20x-100000ss/lkp-nex06
      0.61            +0.1%       0.61        GEO-MEAN ftq.stddev

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
  13062333 ±  0%      +0.4%   13110392 ±  0%  fwq/performance-disable_mtrr_trim-100%-20x-100000ss/loslunas
  13062333            +0.4%   13110391        GEO-MEAN fwq.counts

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
      0.08 ± 13%     -22.9%       0.06 ±  1%  fwq/performance-disable_mtrr_trim-100%-20x-100000ss/loslunas
      0.08           -22.9%       0.06        GEO-MEAN fwq.stddev

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
    151157 ±  7%     +47.3%     222713 ±  1%  hackbench/1600%-process-pipe/lkp-sb03
     60491 ±  5%     +11.8%      67652 ±  3%  hackbench/1600%-process-pipe/lkp-st02
    149667 ±  3%     +10.8%     165778 ±  1%  hackbench/1600%-process-pipe/lkp-ws02
     53252 ±  2%     +12.2%      59761 ±  0%  hackbench/1600%-process-pipe/nhm-white
    131798 ±  0%      +1.5%     133733 ±  0%  hackbench/1600%-process-socket/lkp-sb03
     26157 ±  0%      -3.1%      25359 ±  0%  hackbench/1600%-process-socket/lkp-st02
     33566 ±  0%      -0.8%      33285 ±  0%  hackbench/1600%-process-socket/nhm-white
     13211 ± 14%     +17.9%      15582 ± 22%  hackbench/1600%-process-socket/vm-vp-2G
    156136 ±  3%     +15.5%     180382 ±  4%  hackbench/1600%-threads-pipe/lkp-sb03
     50713 ±  3%     +16.2%      58907 ±  3%  hackbench/1600%-threads-pipe/lkp-st02
    141579 ±  3%     -11.1%     125831 ±  3%  hackbench/1600%-threads-pipe/lkp-ws02
     50283 ±  0%      -0.1%      50218 ±  4%  hackbench/1600%-threads-pipe/nhm-white
     16756 ± 35%     +82.8%      30626 ± 23%  hackbench/1600%-threads-pipe/vm-vp-2G
    141127 ±  0%      -0.7%     140128 ±  0%  hackbench/1600%-threads-socket/lkp-sb03
     25519 ±  0%      -1.7%      25091 ±  0%  hackbench/1600%-threads-socket/lkp-st02
     95356 ±  1%      +2.7%      97902 ±  0%  hackbench/1600%-threads-socket/lkp-ws02
     45342 ±  1%     +22.9%      55728 ±  2%  hackbench/50%-process-pipe/lkp-sb03
     37594 ±  2%     +20.3%      45227 ±  1%  hackbench/50%-process-socket/lkp-sb03
     41087 ±  2%     +28.3%      52698 ±  3%  hackbench/50%-threads-pipe/lkp-sb03
     32685 ±  2%     +29.3%      42266 ±  0%  hackbench/50%-threads-socket/lkp-sb03
     16722 ±  5%      +6.5%      17809 ±  3%  hackbench/performance-1600%-process-pipe/avoton3
     36409 ±  2%      +6.1%      38631 ±  1%  hackbench/performance-1600%-process-pipe/bay
    210022 ± 21%      +8.5%     227837 ±  9%  hackbench/performance-1600%-process-pipe/brickland3
    179759 ±  1%     +20.9%     217284 ±  3%  hackbench/performance-1600%-process-pipe/grantley
    200679 ±  6%      +9.1%     218923 ±  4%  hackbench/performance-1600%-process-pipe/ivb41
    198924 ±  2%     +11.4%     221634 ±  5%  hackbench/performance-1600%-process-pipe/ivb42
     90326 ±  4%     +81.4%     163853 ±  0%  hackbench/performance-1600%-process-pipe/lituya
      3841 ±  2%      +0.3%       3853 ±  1%  hackbench/performance-1600%-process-pipe/lkp-a03
      3878 ±  1%      -1.7%       3814 ±  2%  hackbench/performance-1600%-process-pipe/lkp-a04
      3756 ±  2%      +2.0%       3831 ±  2%  hackbench/performance-1600%-process-pipe/lkp-a06
     17302 ± 11%     +33.1%      23026 ±  4%  hackbench/performance-1600%-process-pipe/lkp-bdw02
     11552 ±  4%      -3.9%      11099 ±  5%  hackbench/performance-1600%-process-pipe/lkp-bdw03
     15863            +3.2%      16373 ±  3%  hackbench/performance-1600%-process-pipe/lkp-bsw01
    238403 ±  2%      +6.5%     254009 ±  1%  hackbench/performance-1600%-process-pipe/lkp-hsw01
    290714 ± 23%     -16.3%     243224 ± 22%  hackbench/performance-1600%-process-pipe/lkp-hsx02
     95403 ±  1%     +33.7%     127540 ±  3%  hackbench/performance-1600%-process-pipe/lkp-ne04
    217280 ±  1%      +5.2%     228664 ±  3%  hackbench/performance-1600%-process-pipe/lkp-nex04
    161857 ±  9%     +65.7%     268277 ±  4%  hackbench/performance-1600%-process-pipe/lkp-sbx04
      8584 ±  6%     -14.8%       7316 ±  4%  hackbench/performance-1600%-process-pipe/lkp-t410
     61053 ±  2%     +13.2%      69135 ±  2%  hackbench/performance-1600%-process-pipe/nhm4
     94209 ±  0%     +18.3%     111445 ±  1%  hackbench/performance-1600%-process-pipe/wsm
      9275 ±  2%      -2.7%       9025 ±  1%  hackbench/performance-1600%-process-socket/avoton3
     17454 ±  0%      -3.6%      16829 ±  0%  hackbench/performance-1600%-process-socket/bay
    186603 ±  0%      +0.1%     186712 ±  0%  hackbench/performance-1600%-process-socket/grantley
    141698 ±  0%      -0.0%     141685        hackbench/performance-1600%-process-socket/ivb41
    141796 ±  0%      -1.0%     140383 ±  0%  hackbench/performance-1600%-process-socket/ivb42
     91118 ±  0%      -0.0%      91080 ±  0%  hackbench/performance-1600%-process-socket/lituya
      2682 ±  0%      -3.5%       2589 ±  1%  hackbench/performance-1600%-process-socket/lkp-a03
      2689 ±  1%      -3.2%       2603 ±  1%  hackbench/performance-1600%-process-socket/lkp-a04
      2677 ±  2%      -3.4%       2585 ±  1%  hackbench/performance-1600%-process-socket/lkp-a06
     17793 ±  0%      -1.9%      17454 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw01
     17810 ±  0%      -2.3%      17392 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw02
     10438 ±  2%      -6.6%       9753 ±  3%  hackbench/performance-1600%-process-socket/lkp-bdw03
     71835 ±  0%      -1.4%      70825 ±  0%  hackbench/performance-1600%-process-socket/lkp-ne04
    126804 ±  0%      +2.3%     129735 ±  0%  hackbench/performance-1600%-process-socket/lkp-nex04
      6038 ±  1%      -7.6%       5579 ±  1%  hackbench/performance-1600%-process-socket/lkp-t410
    193514 ±  1%      -0.4%     192644 ±  1%  hackbench/performance-1600%-process-socket/lkp-wsx02
     38781 ±  0%      -2.7%      37729 ±  0%  hackbench/performance-1600%-process-socket/nhm4
     61930 ±  0%      -2.8%      60175 ±  0%  hackbench/performance-1600%-process-socket/wsm
     16234 ±  4%      +1.3%      16451 ±  3%  hackbench/performance-1600%-threads-pipe/avoton3
     33783 ±  1%      +4.3%      35236 ±  0%  hackbench/performance-1600%-threads-pipe/bay
     87784 ±  4%     +19.4%     104776 ±  5%  hackbench/performance-1600%-threads-pipe/lituya
      3984 ±  3%      +0.3%       3997 ±  1%  hackbench/performance-1600%-threads-pipe/lkp-a03
      3971 ±  2%      -2.2%       3882 ±  1%  hackbench/performance-1600%-threads-pipe/lkp-a04
      3965 ±  2%      -2.0%       3885 ±  1%  hackbench/performance-1600%-threads-pipe/lkp-a06
     16813 ±  4%     +10.0%      18489 ±  5%  hackbench/performance-1600%-threads-pipe/lkp-bdw01
     14705 ±  7%     +23.1%      18108 ±  4%  hackbench/performance-1600%-threads-pipe/lkp-bdw02
     10138 ±  6%      +4.4%      10584 ±  5%  hackbench/performance-1600%-threads-pipe/lkp-bdw03
    296586 ±  4%     +44.8%     429530 ±  2%  hackbench/performance-1600%-threads-pipe/lkp-hsx03
     89740 ±  4%      -3.0%      87023 ±  1%  hackbench/performance-1600%-threads-pipe/lkp-ne04
      7199 ±  2%      -2.3%       7034 ±  5%  hackbench/performance-1600%-threads-pipe/lkp-t410
    160871 ±  3%     +40.7%     226283 ±  3%  hackbench/performance-1600%-threads-pipe/lkp-wsx02
     59435 ±  2%      +1.4%      60289 ±  2%  hackbench/performance-1600%-threads-pipe/nhm4
     88130 ±  1%      +2.6%      90453 ±  2%  hackbench/performance-1600%-threads-pipe/wsm
      8511 ±  3%      -3.7%       8198 ±  1%  hackbench/performance-1600%-threads-socket/avoton3
    150335 ±  0%      -2.0%     147334 ±  0%  hackbench/performance-1600%-threads-socket/ivb42
     88587 ±  0%      -0.0%      88581 ±  0%  hackbench/performance-1600%-threads-socket/lituya
      2545 ±  0%      -9.0%       2316 ±  1%  hackbench/performance-1600%-threads-socket/lkp-a03
      2576 ±  2%      -8.1%       2367 ±  1%  hackbench/performance-1600%-threads-socket/lkp-a04
      2561 ±  2%      -8.9%       2333 ±  0%  hackbench/performance-1600%-threads-socket/lkp-a06
     16940 ±  0%      -2.0%      16594 ±  0%  hackbench/performance-1600%-threads-socket/lkp-bdw01
     16909 ±  0%      -2.3%      16523 ±  0%  hackbench/performance-1600%-threads-socket/lkp-bdw02
      6036 ± 41%     +10.9%       6693 ± 36%  hackbench/performance-1600%-threads-socket/lkp-bdw03
     72247 ±  1%      -1.9%      70848 ±  0%  hackbench/performance-1600%-threads-socket/lkp-ne04
    130287 ±  2%      +1.3%     131979 ±  0%  hackbench/performance-1600%-threads-socket/lkp-nex04
      5284 ±  2%     -11.4%       4684 ±  1%  hackbench/performance-1600%-threads-socket/lkp-t410
    182493 ±  2%      +4.4%     190486 ±  1%  hackbench/performance-1600%-threads-socket/lkp-wsx02
     34589 ±  0%      -2.7%      33639 ±  0%  hackbench/performance-1600%-threads-socket/nhm4
     62138 ±  0%      -2.5%      60563 ±  0%  hackbench/performance-1600%-threads-socket/wsm
     25854 ±  2%     +11.2%      28760 ±  2%  hackbench/performance-50%-process-pipe/brickland3
     48893 ±  2%     +17.2%      57280 ±  3%  hackbench/performance-50%-process-pipe/grantley
     52874 ±  3%     +30.2%      68845 ±  3%  hackbench/performance-50%-process-pipe/ivb42
     32505 ±  1%     +14.1%      37079 ±  1%  hackbench/performance-50%-process-pipe/lituya
      8579 ±  2%      -1.8%       8428 ±  0%  hackbench/performance-50%-process-pipe/lkp-bdw01
      8635 ±  1%      -2.0%       8465 ±  0%  hackbench/performance-50%-process-pipe/lkp-bdw02
      6520 ±  1%     -15.3%       5523 ± 17%  hackbench/performance-50%-process-pipe/lkp-bdw03
     54195 ±  1%     +30.8%      70864 ±  0%  hackbench/performance-50%-process-pipe/lkp-hsw01
     31152 ±  1%     +10.7%      34482 ±  2%  hackbench/performance-50%-process-pipe/lkp-hsx03
     24170 ±  4%     +14.4%      27645 ±  4%  hackbench/performance-50%-process-socket/brickland3
     37786 ±  1%     +31.0%      49489 ±  0%  hackbench/performance-50%-process-socket/grantley
     43874 ±  1%     +33.5%      58567 ±  2%  hackbench/performance-50%-process-socket/ivb42
     23387 ±  0%     +16.3%      27202 ±  0%  hackbench/performance-50%-process-socket/lituya
      5657 ±  0%      +1.6%       5747 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw01
      5650 ±  0%      +2.0%       5762 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw02
      4379 ±  1%      +3.0%       4511 ±  1%  hackbench/performance-50%-process-socket/lkp-bdw03
     41463 ±  4%     +37.7%      57110 ±  1%  hackbench/performance-50%-process-socket/lkp-hsw01
     29333 ±  3%     +17.6%      34494 ±  1%  hackbench/performance-50%-process-socket/lkp-hsx02
     27757 ±  0%     +28.4%      35635 ±  1%  hackbench/performance-50%-process-socket/lkp-hsx03
     25439 ±  4%     +26.9%      32274 ±  7%  hackbench/performance-50%-process-socket/lkp-sbx04
     25117 ±  3%     +16.9%      29363 ±  3%  hackbench/performance-50%-threads-pipe/brickland3
     48543 ±  1%      +8.2%      52520 ±  3%  hackbench/performance-50%-threads-pipe/grantley
     48614 ±  4%     +32.1%      64240 ±  4%  hackbench/performance-50%-threads-pipe/ivb42
     29186 ±  1%     +22.8%      35840 ±  1%  hackbench/performance-50%-threads-pipe/lituya
      8037 ±  2%      +0.5%       8074 ±  0%  hackbench/performance-50%-threads-pipe/lkp-bdw02
      6233 ±  1%      -7.3%       5776 ±  1%  hackbench/performance-50%-threads-pipe/lkp-bdw03
     51951 ±  2%     +25.3%      65118 ±  3%  hackbench/performance-50%-threads-pipe/lkp-hsw01
     32037 ±  1%     +22.3%      39191 ±  4%  hackbench/performance-50%-threads-pipe/lkp-hsx02
     34492 ±  4%      +6.2%      36618 ±  3%  hackbench/performance-50%-threads-pipe/lkp-sbx04
     23770 ±  2%     +22.3%      29071 ±  5%  hackbench/performance-50%-threads-socket/brickland3
     35592 ±  1%     +24.5%      44321 ±  4%  hackbench/performance-50%-threads-socket/grantley
     38872 ±  2%     +36.3%      52965 ±  3%  hackbench/performance-50%-threads-socket/ivb42
     21302 ±  1%     +25.5%      26743 ±  2%  hackbench/performance-50%-threads-socket/lituya
      5380 ±  0%      +4.0%       5595 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw01
      5348 ±  0%      +4.5%       5586 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw02
      4198 ±  0%      +5.1%       4413 ±  1%  hackbench/performance-50%-threads-socket/lkp-bdw03
     25763 ±  4%     +14.3%      29437 ±  2%  hackbench/performance-50%-threads-socket/lkp-sbx04
      7897 ±  5%      -3.6%       7614 ±  1%  hackbench/performance-disable_mtrr_trim-1600%-process-pipe/loslunas
    231162 ±  3%      -5.6%     218130 ±  1%  hackbench/powersave-1600%-process-pipe/grantley
     90577 ±  3%     +78.7%     161906 ±  0%  hackbench/powersave-1600%-process-pipe/lituya
    188325 ±  0%      -2.4%     183750 ±  4%  hackbench/powersave-1600%-process-socket/grantley
     90775 ±  0%      +0.2%      90927 ±  0%  hackbench/powersave-1600%-process-socket/lituya
    226036 ±  3%      +5.9%     239462 ±  0%  hackbench/powersave-1600%-threads-pipe/grantley
     86586 ±  3%     +23.6%     107013 ±  4%  hackbench/powersave-1600%-threads-pipe/lituya
     88852 ±  0%      -0.6%      88356 ±  0%  hackbench/powersave-1600%-threads-socket/lituya
     50247 ±  3%     +16.4%      58483 ±  3%  hackbench/powersave-50%-process-pipe/grantley
     32052 ±  2%     +16.7%      37400 ±  2%  hackbench/powersave-50%-process-pipe/lituya
     38122 ±  3%     +32.0%      50304 ±  1%  hackbench/powersave-50%-process-socket/grantley
     23135 ±  0%     +17.1%      27099 ±  1%  hackbench/powersave-50%-process-socket/lituya
     48163 ±  6%     +13.0%      54417 ±  2%  hackbench/powersave-50%-threads-pipe/grantley
     29762 ±  2%     +19.4%      35533 ±  2%  hackbench/powersave-50%-threads-pipe/lituya
     36361 ±  2%     +27.9%      46518 ±  1%  hackbench/powersave-50%-threads-socket/grantley
     21857 ±  2%     +21.5%      26554 ±  0%  hackbench/powersave-50%-threads-socket/lituya
     31742            +9.2%      34675        GEO-MEAN hackbench.throughput

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
       437 ±  0%      +0.0%        437 ±  0%  linpack/performance/ivb42
       437            +0.0%        437        GEO-MEAN linpack.GFlops

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
     14004 ± 35%    +190.7%      40718 ± 44%  nepim/300s-100%-tcp/lkp-hsw01
     64809 ±  7%     -20.0%      51830 ± 30%  nepim/300s-25%-tcp/lkp-hsw01
     30127           +52.5%      45939        GEO-MEAN nepim.tcp.avg.kbps_out

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
      8418 ± 26%     +77.0%      14902 ± 14%  nepim/300s-100%-tcp/lkp-hsw01
     62956 ±  0%      +0.2%      63079 ±  6%  nepim/300s-25%-tcp/lkp-hsw01
     23021           +33.2%      30659        GEO-MEAN nepim.tcp.avg.kbps_in

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
    437326 ±  0%      +0.3%     438441 ±  2%  pft/performance-20x/t100
   1250886 ±  1%      +4.1%    1302202 ±  0%  pft/performance-disable_mtrr_trim-20x/loslunas
    739625            +2.2%     755605        GEO-MEAN pft.faults_per_sec_per_cpu

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
  76748840 ±  0%      -0.4%   76421835 ±  0%  pigz/performance-100%-128K/xps2
  78299150 ±  0%      -0.1%   78239175 ±  0%  pigz/performance-100%-512K/xps2
  77520119            -0.3%   77325166        GEO-MEAN pigz.throughput

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
        59 ±  0%      -0.8%         59 ±  1%  tlbflush/performance-disable_mtrr_trim-200%-32x-512/loslunas
        59            -0.8%         59        GEO-MEAN tlbflush.mem_acc_cost_ns_time

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------
 1.069e+08 ±  0%      +1.2%  1.082e+08 ±  0%  vm-scalability/performance-300s-128G-truncate/lkp-hsx03
   3998112 ±  0%      +1.9%    4073056 ±  0%  vm-scalability/performance-300s-16G-shm-pread-rand-mt/lkp-hsx03
   7402406 ±  0%      -2.8%    7192847 ±  0%  vm-scalability/performance-300s-16G-shm-pread-rand/lkp-hsx03
   4034232 ±  0%      +0.7%    4060991 ±  0%  vm-scalability/performance-300s-16G-shm-xread-rand-mt/lkp-hsx03
   7386450 ±  0%      -2.0%    7239101 ±  0%  vm-scalability/performance-300s-16G-shm-xread-rand/lkp-hsx03
   7353904 ±  0%     -22.7%    5686546 ±  2%  vm-scalability/performance-300s-1T-msync-mt/lkp-hsx03
   3548358 ±  6%     +17.2%    4158294 ±  1%  vm-scalability/performance-300s-1T-msync/lkp-hsx03
  63746553 ±  1%     -30.6%   44236790 ±  1%  vm-scalability/performance-300s-1T-remap/lkp-hsx03
   8871391 ±  8%      -3.2%    8586294 ±  1%  vm-scalability/performance-300s-256G-lru-shm-rand/lkp-hsx03
  66418665 ±  1%      +0.6%   66786118 ±  0%  vm-scalability/performance-300s-2T-shm-pread-seq/lkp-hsx03
  22800096 ±  0%      +4.7%   23874726 ±  0%  vm-scalability/performance-300s-2T-shm-xread-seq-mt/lkp-hsx03
  65736378 ±  0%      +1.5%   66738487 ±  0%  vm-scalability/performance-300s-2T-shm-xread-seq/lkp-hsx03
   7284668 ±  0%      -3.3%    7047820 ±  0%  vm-scalability/performance-300s-512G-anon-cow-rand-mt/lkp-hsx03
   7031802 ±  0%      -1.1%    6957068 ±  0%  vm-scalability/performance-300s-512G-anon-cow-rand/lkp-hsx03
   7753328 ±  0%     -36.7%    4911510 ±  0%  vm-scalability/performance-300s-512G-anon-w-rand-mt/lkp-hsx03
   8087283 ±  0%      -7.7%    7462337 ±  0%  vm-scalability/performance-300s-512G-anon-w-rand/lkp-hsx03
   7421372 ±  0%      -3.9%    7133772 ±  0%  vm-scalability/performance-300s-512G-anon-wx-rand-mt/lkp-hsx03
  1.43e+08 ±  1%     -37.7%   89152611 ±  0%  vm-scalability/performance-300s-8T-anon-cow-seq-mt/lkp-hsx03
  56124630 ±  0%      +0.5%   56400980 ±  0%  vm-scalability/performance-300s-8T-anon-cow-seq/lkp-hsx03
  77349891 ±  0%     -83.0%   13113183 ±  0%  vm-scalability/performance-300s-8T-anon-w-seq-mt/lkp-hsx03
  79457514 ±  0%     +15.0%   91359278 ±  0%  vm-scalability/performance-300s-8T-anon-w-seq/lkp-hsx03
 2.082e+08 ±  0%     -42.4%  1.199e+08 ±  0%  vm-scalability/performance-300s-8T-anon-wx-seq-mt/lkp-hsx03
  28935505 ±  0%      -0.5%   28795862 ±  0%  vm-scalability/performance-300s-anon-r-rand-mt/lkp-hsx03
  29152977 ±  0%      -0.4%   29029194 ±  0%  vm-scalability/performance-300s-anon-r-rand/lkp-hsx03
  3.38e+08 ±  0%      +0.0%  3.381e+08 ±  0%  vm-scalability/performance-300s-anon-r-seq/lkp-hsx03
   1307431 ±  8%      -6.2%    1226648 ±  3%  vm-scalability/performance-300s-migrate/lkp-hsx03
   5576014 ±  0%      -2.4%    5440090 ±  0%  vm-scalability/performance-300s-mmap-pread-rand/lkp-hsx03
  66497365 ±  7%     -18.7%   54048662 ±  3%  vm-scalability/performance-300s-mmap-pread-seq-mt/lkp-hsx03
   5670638 ±  0%      -5.9%    5336539 ±  1%  vm-scalability/performance-300s-mmap-xread-rand-mt/lkp-hsx03
    579579 ±  0%     +58.0%     915838 ±  0%  vm-scalability/performance-300s-small-allocs-mt/lkp-hsx03
  13499657 ±  0%     +62.0%   21873828 ±  0%  vm-scalability/performance-300s-small-allocs/lkp-hsx03
  17221455            -9.6%   15561536        GEO-MEAN vm-scalability.throughput

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
    349065 ±  0%      +0.1%     349410 ±  0%  will-it-scale/performance-page_fault1/brickland1
   1160020 ±  0%      +1.1%    1172370 ±  0%  will-it-scale/performance-pread1/brickland1
    192136 ±  0%      -0.2%     191687 ±  1%  will-it-scale/performance-pwrite3/brickland1
   2285829 ±  0%      +0.8%    2303678 ±  0%  will-it-scale/performance-sched_yield/brickland1
    614072 ±  0%      +0.3%     616197 ±  0%  will-it-scale/performance-write1/brickland1
    681030 ±  0%      +0.8%     686390 ±  0%  will-it-scale/performance-writeseek1/brickland1
      9979 ±  0%      +1.0%      10076 ±  0%  will-it-scale/powersave-context_switch1/brickland1
    507123 ±  0%      +1.7%     515533 ±  0%  will-it-scale/powersave-eventfd1/brickland1
    423596 ±  0%      +0.8%     427012 ±  0%  will-it-scale/powersave-lock1/brickland1
    150695 ±  0%      -0.6%     149846 ±  0%  will-it-scale/powersave-open1/brickland1
    246364 ±  0%      +0.2%     246894 ±  0%  will-it-scale/powersave-page_fault3/brickland1
    486939 ±  0%      +0.6%     489748 ±  0%  will-it-scale/powersave-pread1/brickland1
    341946 ±  0%      +0.5%     343542 ±  0%  will-it-scale/powersave-read1/brickland1
    345125            +0.5%     346970        GEO-MEAN will-it-scale.per_process_ops

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
    252648 ±  0%      -0.5%     251385 ±  0%  will-it-scale/performance-page_fault1/brickland1
    992444 ±  0%      -1.0%     982934 ±  0%  will-it-scale/performance-pread1/brickland1
    179275 ±  0%      -1.3%     176938 ±  0%  will-it-scale/performance-pwrite3/brickland1
   2292824 ±  0%      +1.2%    2319663 ±  0%  will-it-scale/performance-sched_yield/brickland1
    528243 ±  0%      -2.2%     516867 ±  0%  will-it-scale/performance-write1/brickland1
    520857 ±  0%      -0.1%     520493 ±  0%  will-it-scale/performance-writeseek1/brickland1
      9838 ±  0%      +0.7%       9908 ±  0%  will-it-scale/powersave-context_switch1/brickland1
    416649 ±  0%      +0.6%     419258 ±  0%  will-it-scale/powersave-eventfd1/brickland1
     71754 ±  0%      +0.3%      71959 ±  0%  will-it-scale/powersave-lock1/brickland1
     27910 ±  0%      -3.1%      27047 ±  1%  will-it-scale/powersave-open1/brickland1
     63691 ±  0%      +0.1%      63781 ±  0%  will-it-scale/powersave-page_fault3/brickland1
    432771 ±  0%      -1.5%     426193 ±  0%  will-it-scale/powersave-pread1/brickland1
    294415 ±  0%      +0.5%     295995 ±  0%  will-it-scale/powersave-read1/brickland1
    213277            -0.5%     212255        GEO-MEAN will-it-scale.per_thread_ops

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
      0.15 ±  1%      +0.7%       0.15 ±  0%  will-it-scale/performance-page_fault1/brickland1
      0.28 ±  0%      -0.0%       0.28 ±  0%  will-it-scale/performance-pread1/brickland1
      0.00 ±  0%      +2.5%       0.00 ±  3%  will-it-scale/performance-pwrite3/brickland1
      0.29 ±  0%      +2.4%       0.29 ±  0%  will-it-scale/performance-sched_yield/brickland1
      0.29 ±  0%      -1.0%       0.29 ±  0%  will-it-scale/performance-write1/brickland1
      0.30 ±  1%      +1.5%       0.30 ±  0%  will-it-scale/performance-writeseek1/brickland1
      0.10 ±  0%      +4.0%       0.11 ±  1%  will-it-scale/powersave-context_switch1/brickland1
      0.45 ±  1%      +2.9%       0.46 ±  0%  will-it-scale/powersave-eventfd1/brickland1
      0.43 ±  0%      +2.0%       0.43 ±  0%  will-it-scale/powersave-lock1/brickland1
      0.39 ±  0%      +6.5%       0.42 ±  0%  will-it-scale/powersave-open1/brickland1
      0.54 ±  0%      +0.1%       0.54 ±  0%  will-it-scale/powersave-page_fault3/brickland1
      0.43 ±  0%      -1.3%       0.42 ±  0%  will-it-scale/powersave-pread1/brickland1
      0.42 ±  1%      +0.7%       0.42 ±  1%  will-it-scale/powersave-read1/brickland1
      0.21            +1.6%       0.21        GEO-MEAN will-it-scale.scalability

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
    740.44 ±  0%     -23.6%     565.47 ±  0%  aim7/performance-2000-fork_test/brickland3
    911.93 ±  0%      +0.1%     913.15 ±  0%  aim7/performance-3000-brk_test/brickland3
    665.34 ±  3%      -0.3%     663.11 ±  1%  aim7/performance-3000-disk_cp/brickland3
    650.55 ±  2%      -4.9%     618.41 ±  0%  aim7/performance-3000-disk_rd/brickland3
   1073.75 ±  0%      +1.7%    1092.35 ±  0%  aim7/performance-3000-sieve/brickland3
    864.93 ±  0%     -12.9%     753.22 ±  0%  aim7/performance-8000-disk_src/brickland3
    699.12 ±  9%      +2.7%     718.34 ±  9%  aim7/performance-8000-disk_wrt/brickland3
    989.51 ±  1%      -2.9%     960.95 ±  0%  aim7/performance-8000-mem_rtns_1/brickland3
    519.12 ±  0%      +7.9%     559.94 ±  0%  aim7/performance-8000-misc_rtns_1/brickland3
    963.28 ±  0%      -1.1%     952.83 ±  1%  aim7/performance-8000-sort_rtns_1/brickland3
    945.29 ±  0%      -1.3%     932.87 ±  0%  aim7/performance-8000-string_rtns/brickland3
     59.34 ±  0%      -0.1%      59.28 ±  0%  hackbench/performance-1600%-process-pipe/avoton3
    912.63 ±  2%      -1.9%     895.12 ±  2%  hackbench/performance-1600%-process-pipe/brickland3
    364.93 ±  0%      +0.3%     365.93 ±  0%  hackbench/performance-1600%-process-pipe/grantley
     59.77 ±  0%      +0.5%      60.08 ±  0%  hackbench/performance-1600%-process-socket/avoton3
    375.80 ±  0%      -0.9%     372.35 ±  0%  hackbench/performance-1600%-process-socket/grantley
     59.01 ±  0%      +1.3%      59.77 ±  2%  hackbench/performance-1600%-threads-pipe/avoton3
     59.73 ±  0%      -0.1%      59.69 ±  0%  hackbench/performance-1600%-threads-socket/avoton3
    358.04 ±  0%      -0.1%     357.80 ±  0%  hackbench/performance-50%-process-pipe/grantley
    364.00 ±  0%      +0.6%     366.35 ±  0%  hackbench/performance-50%-process-socket/grantley
    358.24 ±  0%      +0.1%     358.50 ±  0%  hackbench/performance-50%-threads-pipe/grantley
    364.29 ±  0%      +0.5%     366.25 ±  0%  hackbench/performance-50%-threads-socket/grantley
    363.82 ±  0%      -0.8%     360.83 ±  1%  hackbench/powersave-1600%-process-pipe/grantley
    367.31 ±  1%      -1.3%     362.50 ±  1%  hackbench/powersave-1600%-process-socket/grantley
    380.71 ±  0%      -0.6%     378.42 ±  0%  hackbench/powersave-1600%-threads-pipe/grantley
    357.92 ±  0%      +0.1%     358.33 ±  0%  hackbench/powersave-50%-process-pipe/grantley
    363.58 ±  0%      +0.7%     366.27 ±  0%  hackbench/powersave-50%-process-socket/grantley
    358.19 ±  0%      -0.1%     358.00 ±  0%  hackbench/powersave-50%-threads-pipe/grantley
    364.18 ±  0%      +0.5%     365.88 ±  0%  hackbench/powersave-50%-threads-socket/grantley
    328.19 ±  0%      +0.7%     330.62 ±  0%  kbuild/performance-200%/grantley
    321.69 ±  0%      -2.1%     314.78 ±  0%  kbuild/performance-50%/grantley
    314.89 ±  0%      -0.4%     313.63 ±  0%  kbuild/powersave-200%/grantley
    306.44 ±  0%      -2.6%     298.62 ±  0%  kbuild/powersave-50%/grantley
    384.93            -1.4%     379.71        GEO-MEAN pmeter.Average_Active_Power

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
   8355840 ±  0%      +0.0%    8355840 ±  0%  boot/1/vm-kbuild-yocto-i386
  13492224 ±  0%      +0.0%   13492224 ±  0%  boot/1/vm-kbuild-yocto-ia32
  17936384 ±  0%      +0.0%   17936384 ±  0%  boot/1/vm-kbuild-yocto-ia32
  16461824 ±  0%      +0.0%   16461824 ±  0%  boot/1/vm-kbuild-yocto-ia32
   2068480 ±  0%      +0.0%    2068480 ±  0%  boot/1/vm-kbuild-yocto-ia32
  13492224 ±  0%      +0.0%   13492224 ±  0%  boot/1/vm-kbuild-yocto-x86_64
  17936384 ±  0%      +0.0%   17936384 ±  0%  boot/1/vm-kbuild-yocto-x86_64
   1802240 ±  0%      +0.0%    1802240 ±  0%  boot/1/vm-kbuild-yocto-x86_64
   8355840 ±  0%      +0.0%    8355840 ±  0%  boot/1/vm-vp-quantal-i386
   8575991            +0.0%    8575991        GEO-MEAN kernel-size.bss

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
   3645184 ±  0%      -0.0%    3645120 ±  0%  boot/1/vm-kbuild-yocto-i386
   4087080 ±  0%      -0.0%    4087016 ±  0%  boot/1/vm-kbuild-yocto-ia32
   8292384 ±  0%      +0.0%    8292384 ±  0%  boot/1/vm-kbuild-yocto-ia32
   4809696 ±  0%      +0.0%    4809696 ±  0%  boot/1/vm-kbuild-yocto-ia32
   3183080 ±  0%      -0.0%    3183016 ±  0%  boot/1/vm-kbuild-yocto-ia32
   4087080 ±  0%      -0.0%    4087016 ±  0%  boot/1/vm-kbuild-yocto-x86_64
   8292384 ±  0%      +0.0%    8292384 ±  0%  boot/1/vm-kbuild-yocto-x86_64
   2718584 ±  0%      -0.0%    2718520 ±  0%  boot/1/vm-kbuild-yocto-x86_64
   3645184 ±  0%      -0.0%    3645120 ±  0%  boot/1/vm-vp-quantal-i386
   4413368            -0.0%    4413314        GEO-MEAN kernel-size.data

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
   9076279 ±  0%      -0.0%    9076215 ±  0%  boot/1/vm-kbuild-yocto-i386
  19555593 ±  0%      -0.0%   19555201 ±  0%  boot/1/vm-kbuild-yocto-ia32
  48493597 ±  0%      +0.0%   48493597 ±  0%  boot/1/vm-kbuild-yocto-ia32
  19839225 ±  0%      -0.0%   19838841 ±  0%  boot/1/vm-kbuild-yocto-ia32
  13488401 ±  0%      +0.0%   13488785 ±  0%  boot/1/vm-kbuild-yocto-ia32
  19555593 ±  0%      -0.0%   19555201 ±  0%  boot/1/vm-kbuild-yocto-x86_64
  48493597 ±  0%      +0.0%   48493597 ±  0%  boot/1/vm-kbuild-yocto-x86_64
  15644016 ±  0%      +0.0%   15644208 ±  0%  boot/1/vm-kbuild-yocto-x86_64
   9076279 ±  0%      -0.0%    9076215 ±  0%  boot/1/vm-vp-quantal-i386
  18916465            -0.0%   18916396        GEO-MEAN kernel-size.text

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
     23.69 ±  1%     -18.2%      19.37 ±  0%  aim7/performance-2000-fork_test/brickland3
     97.34 ±  0%      -0.1%      97.24 ±  0%  aim7/performance-3000-brk_test/brickland3
     31.50 ± 19%      +4.3%      32.87 ±  6%  aim7/performance-3000-disk_cp/brickland3
     31.31 ±  9%     -18.2%      25.61 ±  1%  aim7/performance-3000-disk_rd/brickland3
     95.45 ±  0%      -0.7%      94.81 ±  0%  aim7/performance-3000-sieve/brickland3
     53.69 ±  1%     -32.4%      36.27 ±  0%  aim7/performance-8000-disk_src/brickland3
     28.60 ± 24%      -6.4%      26.76 ±  4%  aim7/performance-8000-disk_wrt/brickland3
     62.51 ±  3%      -7.2%      58.02 ±  1%  aim7/performance-8000-mem_rtns_1/brickland3
      5.88 ±  1%      +6.6%       6.27 ±  0%  aim7/performance-8000-misc_rtns_1/brickland3
     62.94 ±  2%      -6.1%      59.09 ±  6%  aim7/performance-8000-sort_rtns_1/brickland3
     80.47 ±  0%      -7.2%      74.66 ±  1%  aim7/performance-8000-string_rtns/brickland3
     99.13 ±  0%      -1.0%      98.11 ±  0%  ebizzy/performance-200%-100x-10s/lkp-nex04
      2.31 ±  1%      -0.2%       2.31 ±  1%  fsmark/performance-1x-1t-1BRD_32G-ext4-4K-4G-fsyncBeforeClose-1fpd/ivb43
      2.26 ±  0%      +0.9%       2.29 ±  0%  fsmark/performance-1x-1t-1BRD_32G-ext4-nfsv4-4K-4G-fsyncBeforeClose-1fpd/ivb43
      1.96 ±  0%      -1.3%       1.94 ±  1%  fsmark/performance-1x-1t-1BRD_32G-xfs-4K-4G-fsyncBeforeClose-1fpd/ivb43
      2.45 ±  1%      -3.3%       2.37 ±  0%  fsmark/performance-1x-1t-1BRD_32G-xfs-nfsv4-4K-4G-fsyncBeforeClose-1fpd/ivb43
     56.88 ±  0%      +1.2%      57.58 ±  0%  ftq/performance-100%-20x-100000ss/lkp-nex06
     98.00 ±  0%      +1.3%      99.26 ±  0%  fwq/performance-disable_mtrr_trim-100%-20x-100000ss/loslunas
     94.66 ±  0%      -4.6%      90.32 ±  4%  hackbench/1600%-process-pipe/lkp-sb03
     94.27 ±  0%      -1.1%      93.27 ±  0%  hackbench/1600%-process-pipe/lkp-ws02
     98.34 ±  0%      -0.5%      97.83 ±  0%  hackbench/1600%-process-pipe/nhm-white
     95.18 ±  0%      -0.2%      95.02 ±  0%  hackbench/1600%-process-socket/lkp-sb03
     98.86 ±  0%      -0.2%      98.69 ±  0%  hackbench/1600%-process-socket/nhm-white
     98.20 ±  0%      -0.4%      97.76 ±  0%  hackbench/1600%-threads-pipe/lkp-sb03
     98.10 ±  0%      +0.1%      98.17 ±  0%  hackbench/1600%-threads-pipe/lkp-ws02
     99.26 ±  0%      -0.1%      99.21 ±  0%  hackbench/1600%-threads-pipe/nhm-white
     98.18 ±  0%      -0.2%      98.03 ±  0%  hackbench/1600%-threads-socket/lkp-sb03
     98.45 ±  0%      -0.1%      98.33 ±  0%  hackbench/1600%-threads-socket/lkp-ws02
     99.52 ±  0%      -0.3%      99.20 ±  0%  hackbench/50%-process-pipe/lkp-sb03
     99.06 ±  0%      +0.2%      99.22 ±  0%  hackbench/50%-process-socket/lkp-sb03
     99.52 ±  0%      -0.3%      99.27 ±  0%  hackbench/50%-threads-pipe/lkp-sb03
     99.35 ±  0%      -0.1%      99.23 ±  0%  hackbench/50%-threads-socket/lkp-sb03
     99.07 ±  0%      -0.3%      98.76 ±  0%  hackbench/performance-1600%-process-pipe/avoton3
     84.16 ±  5%      -4.9%      80.02 ±  4%  hackbench/performance-1600%-process-pipe/brickland3
     91.06 ±  0%      -2.1%      89.14 ±  0%  hackbench/performance-1600%-process-pipe/grantley
     91.67 ±  0%      -1.1%      90.70 ±  0%  hackbench/performance-1600%-process-pipe/ivb41
     91.66 ±  0%      -1.3%      90.49 ±  0%  hackbench/performance-1600%-process-pipe/ivb42
     97.84 ±  0%      -2.4%      95.46 ±  0%  hackbench/performance-1600%-process-pipe/lituya
     99.51 ±  0%      -0.5%      99.02 ±  0%  hackbench/performance-1600%-process-pipe/lkp-bdw02
     99.51 ±  0%      -0.1%      99.46 ±  0%  hackbench/performance-1600%-process-pipe/lkp-bdw03
     99.12            -0.3%      98.78 ±  0%  hackbench/performance-1600%-process-pipe/lkp-bsw01
     91.05 ±  0%      -1.0%      90.18 ±  0%  hackbench/performance-1600%-process-pipe/lkp-hsw01
     70.97 ± 12%     -21.6%      55.60 ± 13%  hackbench/performance-1600%-process-pipe/lkp-hsx02
     96.81 ±  0%      -1.5%      95.38 ±  0%  hackbench/performance-1600%-process-pipe/lkp-ne04
     86.74 ±  0%      -1.6%      85.38 ±  1%  hackbench/performance-1600%-process-pipe/lkp-nex04
     91.91 ±  0%      -5.3%      87.08 ±  0%  hackbench/performance-1600%-process-pipe/lkp-sbx04
     99.50 ±  0%      -0.0%      99.47 ±  0%  hackbench/performance-1600%-process-pipe/lkp-t410
     98.27 ±  0%      -0.6%      97.66 ±  0%  hackbench/performance-1600%-process-pipe/nhm4
     97.29 ±  0%      -0.9%      96.42 ±  0%  hackbench/performance-1600%-process-pipe/wsm
     99.39 ±  0%      -0.1%      99.28 ±  0%  hackbench/performance-1600%-process-socket/avoton3
     90.83 ±  0%      -0.5%      90.35 ±  1%  hackbench/performance-1600%-process-socket/grantley
     94.02 ±  0%      -0.4%      93.69        hackbench/performance-1600%-process-socket/ivb41
     94.01 ±  0%      -0.6%      93.41 ±  0%  hackbench/performance-1600%-process-socket/ivb42
     97.69 ±  0%      -0.2%      97.49 ±  0%  hackbench/performance-1600%-process-socket/lituya
     99.14 ±  0%      +0.1%      99.28 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw01
     99.46 ±  0%      -0.2%      99.28 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw02
     99.56 ±  0%      -0.1%      99.42 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw03
     97.52 ±  0%      -0.1%      97.37 ±  0%  hackbench/performance-1600%-process-socket/lkp-ne04
     91.52 ±  0%      -3.9%      87.91 ±  7%  hackbench/performance-1600%-process-socket/lkp-nex04
     99.60 ±  0%      -0.0%      99.58 ±  0%  hackbench/performance-1600%-process-socket/lkp-t410
     88.50 ±  0%      -0.5%      88.09 ±  0%  hackbench/performance-1600%-process-socket/lkp-wsx02
     98.81 ±  0%      -0.3%      98.52 ±  0%  hackbench/performance-1600%-process-socket/nhm4
     98.09 ±  0%      -0.2%      97.88 ±  0%  hackbench/performance-1600%-process-socket/wsm
     99.50 ±  0%      -0.0%      99.45 ±  0%  hackbench/performance-1600%-threads-pipe/avoton3
     99.20 ±  0%      -0.2%      99.05 ±  0%  hackbench/performance-1600%-threads-pipe/lituya
     99.73 ±  0%      -0.0%      99.68 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-bdw01
     99.73 ±  0%      -0.1%      99.65 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-bdw02
     99.68 ±  0%      +0.1%      99.73 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-bdw03
     90.02 ±  3%      -0.3%      89.78 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-hsx03
     98.90 ±  0%      -0.1%      98.80 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-ne04
     99.77 ±  0%      -0.0%      99.76 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-t410
     96.70 ±  0%      -1.8%      95.00 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-wsx02
     99.30 ±  0%      -0.1%      99.17 ±  0%  hackbench/performance-1600%-threads-pipe/nhm4
     99.07 ±  0%      -0.1%      98.97 ±  0%  hackbench/performance-1600%-threads-pipe/wsm
     99.69 ±  0%      -0.0%      99.65 ±  0%  hackbench/performance-1600%-threads-socket/avoton3
     97.05 ±  1%      -0.1%      96.97 ±  0%  hackbench/performance-1600%-threads-socket/ivb42
     99.08 ±  0%      -1.6%      97.47 ±  1%  hackbench/performance-1600%-threads-socket/lituya
     99.69 ±  0%      -0.1%      99.58 ±  0%  hackbench/performance-1600%-threads-socket/lkp-bdw01
     99.69 ±  0%      -0.1%      99.62 ±  0%  hackbench/performance-1600%-threads-socket/lkp-bdw02
     99.63 ±  0%      +0.1%      99.71 ±  0%  hackbench/performance-1600%-threads-socket/lkp-bdw03
     98.98 ±  0%      -0.1%      98.90 ±  0%  hackbench/performance-1600%-threads-socket/lkp-ne04
     97.01 ±  0%      -0.3%      96.69 ±  0%  hackbench/performance-1600%-threads-socket/lkp-nex04
     99.78 ±  0%      +0.0%      99.80 ±  0%  hackbench/performance-1600%-threads-socket/lkp-t410
     96.31 ±  0%      -0.6%      95.69 ±  0%  hackbench/performance-1600%-threads-socket/lkp-wsx02
     99.19 ±  0%      -9.0%      90.26 ± 19%  hackbench/performance-1600%-threads-socket/nhm4
     99.22 ±  0%      -0.1%      99.08 ±  0%  hackbench/performance-1600%-threads-socket/wsm
     99.77 ±  0%      -0.1%      99.62 ±  0%  hackbench/performance-50%-process-pipe/brickland3
     99.46 ±  0%      -0.2%      99.30 ±  0%  hackbench/performance-50%-process-pipe/grantley
     99.45 ±  0%      -0.3%      99.11 ±  0%  hackbench/performance-50%-process-pipe/ivb42
     99.55 ±  0%      -0.2%      99.40 ±  0%  hackbench/performance-50%-process-pipe/lituya
     99.61 ±  0%      +0.1%      99.72 ±  0%  hackbench/performance-50%-process-pipe/lkp-bdw01
     99.74 ±  0%      -0.0%      99.74 ±  0%  hackbench/performance-50%-process-pipe/lkp-bdw02
     99.75 ±  0%      -0.1%      99.65 ±  0%  hackbench/performance-50%-process-pipe/lkp-bdw03
     99.47 ±  0%      -0.3%      99.21 ±  0%  hackbench/performance-50%-process-pipe/lkp-hsw01
     99.82 ±  0%      -0.1%      99.71 ±  0%  hackbench/performance-50%-process-pipe/lkp-hsx03
     99.76 ±  0%      -0.1%      99.62 ±  0%  hackbench/performance-50%-process-socket/brickland3
     99.50 ±  0%      -0.3%      99.20 ±  0%  hackbench/performance-50%-process-socket/grantley
     99.45 ±  0%      -0.3%      99.18 ±  0%  hackbench/performance-50%-process-socket/ivb42
     99.58 ±  0%      -1.1%      98.45 ±  2%  hackbench/performance-50%-process-socket/lituya
     99.75 ±  0%      -0.0%      99.72 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw01
     99.76 ±  0%      -0.1%      99.70 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw02
     99.77 ±  0%      -0.0%      99.76 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw03
     99.40 ±  0%      -0.2%      99.17 ±  0%  hackbench/performance-50%-process-socket/lkp-hsw01
     99.81 ±  0%      -0.1%      99.71 ±  0%  hackbench/performance-50%-process-socket/lkp-hsx02
     99.80 ±  0%      -0.1%      99.69 ±  0%  hackbench/performance-50%-process-socket/lkp-hsx03
     99.61 ±  0%      -0.3%      99.30 ±  0%  hackbench/performance-50%-process-socket/lkp-sbx04
     99.76 ±  0%      -0.1%      99.63 ±  0%  hackbench/performance-50%-threads-pipe/brickland3
     99.52 ±  0%      -0.2%      99.36 ±  0%  hackbench/performance-50%-threads-pipe/grantley
     99.47 ±  0%      -0.3%      99.19 ±  0%  hackbench/performance-50%-threads-pipe/ivb42
     99.55 ±  0%      -6.7%      92.84 ±  8%  hackbench/performance-50%-threads-pipe/lituya
     99.74 ±  0%      +0.0%      99.74 ±  0%  hackbench/performance-50%-threads-pipe/lkp-bdw02
     99.76 ±  0%      -1.8%      97.99 ±  3%  hackbench/performance-50%-threads-pipe/lkp-bdw03
     99.52 ±  0%      -0.6%      98.92 ±  0%  hackbench/performance-50%-threads-pipe/lkp-hsw01
     99.84 ±  0%      -0.1%      99.72 ±  0%  hackbench/performance-50%-threads-pipe/lkp-hsx02
     99.52 ±  0%      -0.0%      99.49 ±  0%  hackbench/performance-50%-threads-pipe/lkp-sbx04
     99.78 ±  0%      -0.2%      99.61 ±  0%  hackbench/performance-50%-threads-socket/brickland3
     99.53 ±  0%      -0.2%      99.33 ±  0%  hackbench/performance-50%-threads-socket/grantley
     99.45 ±  0%      -0.3%      99.16 ±  0%  hackbench/performance-50%-threads-socket/ivb42
     99.57 ±  0%      -0.1%      99.46 ±  0%  hackbench/performance-50%-threads-socket/lituya
     99.75 ±  0%      +0.0%      99.77 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw01
     99.75 ±  0%      +0.0%      99.75 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw02
     99.78 ±  0%      +0.0%      99.79 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw03
     99.59 ±  0%      -0.2%      99.37 ±  0%  hackbench/performance-50%-threads-socket/lkp-sbx04
     98.98 ±  0%      -0.1%      98.91 ±  0%  hackbench/performance-disable_mtrr_trim-1600%-process-pipe/loslunas
     89.35 ±  0%      -1.4%      88.13 ±  3%  hackbench/powersave-1600%-process-pipe/grantley
     97.79 ±  0%      -2.1%      95.71 ±  0%  hackbench/powersave-1600%-process-pipe/lituya
     91.14 ±  0%      -0.4%      90.80 ±  0%  hackbench/powersave-1600%-process-socket/grantley
     97.64 ±  0%      -0.2%      97.48 ±  0%  hackbench/powersave-1600%-process-socket/lituya
     96.91 ±  0%      -0.4%      96.53 ±  0%  hackbench/powersave-1600%-threads-pipe/grantley
     99.23 ±  0%      -0.2%      99.02 ±  0%  hackbench/powersave-1600%-threads-pipe/lituya
     99.08 ±  0%      -0.1%      98.94 ±  0%  hackbench/powersave-1600%-threads-socket/lituya
     99.48 ±  0%      -0.2%      99.29 ±  0%  hackbench/powersave-50%-process-pipe/grantley
     99.42 ±  0%      -0.1%      99.36 ±  0%  hackbench/powersave-50%-process-pipe/lituya
     99.24 ±  0%      +0.0%      99.28 ±  0%  hackbench/powersave-50%-process-socket/grantley
     99.59 ±  0%      -0.1%      99.54 ±  0%  hackbench/powersave-50%-process-socket/lituya
     99.52 ±  0%      -0.2%      99.35 ±  0%  hackbench/powersave-50%-threads-pipe/grantley
     99.56 ±  0%      -0.4%      99.17 ±  0%  hackbench/powersave-50%-threads-pipe/lituya
     99.53 ±  0%      -0.2%      99.29 ±  0%  hackbench/powersave-50%-threads-socket/grantley
     99.59 ±  0%      -0.0%      99.55 ±  0%  hackbench/powersave-50%-threads-socket/lituya
     86.22 ±  0%      -1.4%      84.98 ±  0%  hpcc/performance/lkp-hsx03
     94.12 ±  0%      -1.3%      92.86 ±  0%  hpcc/performance/lkp-sbx04
     62.84 ±  0%      +0.1%      62.89 ±  0%  kbuild/performance-200%/grantley
     36.73 ±  0%      -2.8%      35.72 ±  0%  kbuild/performance-50%/grantley
     56.52 ±  1%      -2.5%      55.13 ±  1%  kbuild/powersave-200%/grantley
     33.23 ±  0%      -2.2%      32.52 ±  0%  kbuild/powersave-50%/grantley
     69.91 ±  0%      -0.0%      69.90 ±  0%  linpack/performance/ivb42
      1.47 ±  0%     +38.0%       2.02 ±  2%  nepim/300s-100%-tcp/lkp-hsw01
      0.68            -6.6%       0.64 ±  2%  nepim/300s-100%-tcp6/lkp-hsw01
      0.57 ±  5%      -0.0%       0.57 ±  0%  nepim/300s-100%-udp/lkp-hsw01
      1.74 ±  1%      +1.7%       1.77 ±  1%  nepim/300s-100%-udp6/lkp-hsw01
      0.98 ±  0%     +14.3%       1.12 ± 13%  nepim/300s-25%-tcp/lkp-hsw01
      0.60 ±  1%      +5.8%       0.64 ±  5%  nepim/300s-25%-tcp6/lkp-hsw01
      0.45 ±  5%      -1.1%       0.45 ±  0%  nepim/300s-25%-udp/lkp-hsw01
      1.14 ±  1%      -2.2%       1.12 ±  0%  nepim/300s-25%-udp6/lkp-hsw01
      3.62 ±  3%      -3.0%       3.51 ±  2%  netpipe/performance-tcp/ivb42
      0.18 ±  2%      -5.4%       0.17 ±  2%  nuttcp/300s/lkp-hsw01
     99.34 ±  0%      -0.6%      98.75 ±  0%  pbzip2/performance-100%-500K/ivb42
     99.25 ±  0%      -1.2%      98.02 ±  0%  pbzip2/performance-100%-900K/ivb42
     25.21 ±  0%      +0.0%      25.21 ±  0%  pbzip2/performance-25%-500K/ivb42
     25.19 ±  0%      +0.0%      25.19 ±  0%  pbzip2/performance-25%-900K/ivb42
     34.45 ±  3%     +26.8%      43.70 ± 23%  pft/performance-20x/t100
     53.06 ±  0%      -2.7%      51.64 ±  0%  pft/performance-disable_mtrr_trim-20x/loslunas
     99.47 ±  0%      -0.4%      99.03 ±  0%  pigz/performance-100%-128K/xps2
     99.47 ±  0%      -0.6%      98.86 ±  0%  pigz/performance-100%-512K/xps2
     93.03 ±  0%      -1.2%      91.88 ±  0%  pixz/performance-100%/lkp-wsx02
     93.64 ±  0%      -1.0%      92.75 ±  0%  plzip/performance-100%/lkp-sbx04
     24.73 ±  0%      -0.5%      24.61 ±  0%  plzip/performance-25%/lkp-sbx04
     26.02 ±  0%      +0.0%      26.03 ±  0%  tcrypt/performance-2s-200-204/avoton1
     25.99 ±  0%      +0.0%      26.01 ±  0%  tcrypt/performance-2s-205-210/avoton1
     25.98 ±  0%      +0.3%      26.05 ±  0%  tcrypt/performance-2s-301-319/avoton1
     25.99 ±  0%      +0.1%      26.02 ±  0%  tcrypt/performance-2s-401-417/avoton1
     26.00 ±  0%      +0.1%      26.02 ±  0%  tcrypt/performance-2s-500-504/avoton1
     15.13 ±  0%     +13.5%      17.17 ±  6%  tlbflush/performance-disable_mtrr_trim-200%-32x-512/loslunas
     38.62 ± 22%     +22.1%      47.15 ±  0%  unixbench/performance-disable_mtrr_trim-context1/loslunas
     44.00 ±  0%      -0.4%      43.84 ±  0%  unixbench/performance-disable_mtrr_trim-dhry2reg/loslunas
     46.31 ±  0%     -51.0%      22.70 ±  1%  unixbench/performance-disable_mtrr_trim-execl/loslunas
     44.13 ±  0%      -0.7%      43.81 ±  0%  unixbench/performance-disable_mtrr_trim-pipe/loslunas
     52.62 ±  0%     -39.8%      31.69 ±  1%  unixbench/performance-disable_mtrr_trim-shell1/loslunas
     66.97 ±  0%     -13.4%      58.03 ±  1%  unixbench/performance-disable_mtrr_trim-shell8/loslunas
     45.66 ±  0%     -57.1%      19.59 ±  0%  unixbench/performance-disable_mtrr_trim-spawn/loslunas
     44.04 ±  0%      -0.5%      43.82 ±  0%  unixbench/performance-disable_mtrr_trim-syscall/loslunas
     46.36 ±  0%      -0.9%      45.92 ±  0%  unixbench/performance-disable_mtrr_trim-whetstone-double/loslunas
      1.90 ±  6%      -3.7%       1.82 ±  4%  vm-scalability/performance-300s-128G-truncate/lkp-hsx03
     50.87 ±  0%     -17.0%      42.20 ±  0%  vm-scalability/performance-300s-16G-shm-pread-rand-mt/lkp-hsx03
     95.88 ±  0%      -0.7%      95.22 ±  0%  vm-scalability/performance-300s-16G-shm-pread-rand/lkp-hsx03
     62.81 ±  1%     -17.5%      51.80 ±  0%  vm-scalability/performance-300s-16G-shm-xread-rand-mt/lkp-hsx03
     95.86 ±  0%      -0.9%      95.03 ±  0%  vm-scalability/performance-300s-16G-shm-xread-rand/lkp-hsx03
     27.98 ±  0%      -8.2%      25.68 ±  0%  vm-scalability/performance-300s-1T-msync-mt/lkp-hsx03
      2.77 ±  0%      +2.9%       2.85 ±  6%  vm-scalability/performance-300s-1T-msync/lkp-hsx03
     11.18 ±  1%     -25.6%       8.32 ±  1%  vm-scalability/performance-300s-1T-remap/lkp-hsx03
     45.14 ±  8%      -6.5%      42.20 ±  1%  vm-scalability/performance-300s-256G-lru-shm-rand/lkp-hsx03
     70.17 ±  0%      -1.2%      69.34 ±  0%  vm-scalability/performance-300s-2T-shm-pread-seq/lkp-hsx03
     22.26 ±  0%      -3.9%      21.39 ±  0%  vm-scalability/performance-300s-2T-shm-xread-seq-mt/lkp-hsx03
     69.82 ±  0%      +0.3%      70.00 ±  0%  vm-scalability/performance-300s-2T-shm-xread-seq/lkp-hsx03
     92.93 ±  0%      -1.9%      91.17 ±  0%  vm-scalability/performance-300s-512G-anon-cow-rand-mt/lkp-hsx03
     88.34 ±  0%      -3.4%      85.34 ±  0%  vm-scalability/performance-300s-512G-anon-cow-rand/lkp-hsx03
     72.98 ±  3%     -11.9%      64.27 ±  0%  vm-scalability/performance-300s-512G-anon-w-rand-mt/lkp-hsx03
     85.16 ±  0%      -8.8%      77.66 ±  0%  vm-scalability/performance-300s-512G-anon-w-rand/lkp-hsx03
     91.75 ±  0%      -5.0%      87.14 ±  3%  vm-scalability/performance-300s-512G-anon-wx-rand-mt/lkp-hsx03
     65.04 ±  0%     -11.6%      57.47 ±  4%  vm-scalability/performance-300s-8T-anon-cow-seq-mt/lkp-hsx03
     59.20 ±  0%      +5.3%      62.36 ±  0%  vm-scalability/performance-300s-8T-anon-cow-seq/lkp-hsx03
     45.38 ±  0%     -85.1%       6.77 ±  3%  vm-scalability/performance-300s-8T-anon-w-seq-mt/lkp-hsx03
     87.53 ±  0%     -23.9%      66.64 ±  0%  vm-scalability/performance-300s-8T-anon-w-seq/lkp-hsx03
     76.49 ±  2%     -14.9%      65.08 ±  1%  vm-scalability/performance-300s-8T-anon-wx-seq-mt/lkp-hsx03
     95.39 ±  0%      -0.4%      94.97 ±  0%  vm-scalability/performance-300s-anon-r-rand-mt/lkp-hsx03
     93.36 ±  1%      +3.3%      96.45 ±  0%  vm-scalability/performance-300s-anon-r-rand/lkp-hsx03
     98.25 ±  0%      -1.5%      96.80 ±  0%  vm-scalability/performance-300s-anon-r-seq/lkp-hsx03
      0.75 ±  1%      +0.0%       0.75 ±  0%  vm-scalability/performance-300s-migrate/lkp-hsx03
     98.38 ±  0%      -0.3%      98.10 ±  0%  vm-scalability/performance-300s-mmap-pread-rand/lkp-hsx03
     64.52 ±  3%     +11.4%      71.88 ±  0%  vm-scalability/performance-300s-mmap-pread-seq-mt/lkp-hsx03
     91.45 ±  0%      -0.0%      91.43 ±  0%  vm-scalability/performance-300s-mmap-xread-rand-mt/lkp-hsx03
      5.96 ±  0%     +21.6%       7.25 ±  0%  vm-scalability/performance-300s-small-allocs-mt/lkp-hsx03
      5.89 ±  0%     +16.4%       6.86 ±  0%  vm-scalability/performance-300s-small-allocs/lkp-hsx03
     33.42 ±  0%      -0.4%      33.28 ±  0%  will-it-scale/performance-page_fault1/brickland1
     42.73 ±  0%      -0.2%      42.65 ±  0%  will-it-scale/performance-pread1/brickland1
     41.19 ±  0%      -0.2%      41.09 ±  0%  will-it-scale/performance-pwrite3/brickland1
     42.70 ±  0%      -0.1%      42.66 ±  0%  will-it-scale/performance-sched_yield/brickland1
     42.72 ±  0%      -0.2%      42.61 ±  0%  will-it-scale/performance-write1/brickland1
     42.73 ±  0%      -0.2%      42.65 ±  0%  will-it-scale/performance-writeseek1/brickland1
     52.42 ±  0%      +1.6%      53.24 ±  0%  will-it-scale/powersave-context_switch1/brickland1
     42.34 ±  0%      -0.2%      42.24 ±  0%  will-it-scale/powersave-eventfd1/brickland1
     42.34 ±  0%      -0.3%      42.20 ±  0%  will-it-scale/powersave-lock1/brickland1
     44.15 ±  0%      -8.3%      40.47 ±  0%  will-it-scale/powersave-open1/brickland1
     33.66 ±  0%      +0.3%      33.74 ±  0%  will-it-scale/powersave-page_fault3/brickland1
     42.34 ±  0%      -0.3%      42.23 ±  0%  will-it-scale/powersave-pread1/brickland1
     42.27 ±  0%      -0.2%      42.19 ±  0%  will-it-scale/powersave-read1/brickland1
     53.37            -3.0%      51.80        GEO-MEAN turbostat.%Busy

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
       756 ±  1%     -18.3%        617 ±  0%  aim7/performance-2000-fork_test/brickland3
      3107 ±  0%      -0.1%       3104 ±  0%  aim7/performance-3000-brk_test/brickland3
      1006 ± 19%      +4.3%       1049 ±  6%  aim7/performance-3000-disk_cp/brickland3
      1000 ±  9%     -18.2%        818 ±  1%  aim7/performance-3000-disk_rd/brickland3
      3047 ±  0%      -0.7%       3027 ±  0%  aim7/performance-3000-sieve/brickland3
      1714 ±  1%     -32.5%       1157 ±  0%  aim7/performance-8000-disk_src/brickland3
       913 ± 24%      -6.5%        854 ±  4%  aim7/performance-8000-disk_wrt/brickland3
      1995 ±  3%      -7.2%       1852 ±  1%  aim7/performance-8000-mem_rtns_1/brickland3
       187 ±  1%      +6.4%        199 ±  0%  aim7/performance-8000-misc_rtns_1/brickland3
      2009 ±  2%      -6.1%       1886 ±  6%  aim7/performance-8000-sort_rtns_1/brickland3
      2568 ±  0%      -7.2%       2382 ±  1%  aim7/performance-8000-string_rtns/brickland3
      2246 ±  0%      -0.8%       2229 ±  0%  ebizzy/performance-200%-100x-10s/lkp-nex04
        76 ±  1%      -0.7%         76 ±  1%  fsmark/performance-1x-1t-1BRD_32G-ext4-4K-4G-fsyncBeforeClose-1fpd/ivb43
        68 ±  0%      -0.7%         68 ±  1%  fsmark/performance-1x-1t-1BRD_32G-ext4-nfsv4-4K-4G-fsyncBeforeClose-1fpd/ivb43
        64 ±  0%      -0.8%         63 ±  2%  fsmark/performance-1x-1t-1BRD_32G-xfs-4K-4G-fsyncBeforeClose-1fpd/ivb43
        73 ±  2%      -4.1%         70 ±  0%  fsmark/performance-1x-1t-1BRD_32G-xfs-nfsv4-4K-4G-fsyncBeforeClose-1fpd/ivb43
      1176 ±  0%      +1.2%       1190 ±  0%  ftq/performance-100%-20x-100000ss/lkp-nex06
      2347 ±  0%      +1.3%       2377 ±  0%  fwq/performance-disable_mtrr_trim-100%-20x-100000ss/loslunas
      2926 ±  0%      -4.6%       2791 ±  4%  hackbench/1600%-process-pipe/lkp-sb03
      2767 ±  0%      -1.3%       2732 ±  0%  hackbench/1600%-process-pipe/lkp-ws02
      2877 ±  0%      -0.5%       2862 ±  0%  hackbench/1600%-process-pipe/nhm-white
      2941 ±  0%      -0.9%       2914 ±  0%  hackbench/1600%-process-socket/lkp-sb03
      2892 ±  0%      -0.2%       2887 ±  0%  hackbench/1600%-process-socket/nhm-white
      3036 ±  0%      -0.5%       3022 ±  0%  hackbench/1600%-threads-pipe/lkp-sb03
      2867 ±  0%      -0.5%       2853 ±  0%  hackbench/1600%-threads-pipe/lkp-ws02
      2904 ±  0%      -0.1%       2902 ±  0%  hackbench/1600%-threads-pipe/nhm-white
      3034 ±  0%      -0.5%       3018 ±  0%  hackbench/1600%-threads-socket/lkp-sb03
      2941 ±  0%      -0.9%       2915 ±  0%  hackbench/1600%-threads-socket/lkp-ws02
      3077 ±  0%      -0.3%       3067 ±  0%  hackbench/50%-process-pipe/lkp-sb03
      3063 ±  0%      +0.2%       3068 ±  0%  hackbench/50%-process-socket/lkp-sb03
      3077 ±  0%      -0.3%       3069 ±  0%  hackbench/50%-threads-pipe/lkp-sb03
      3072 ±  0%      -0.1%       3068 ±  0%  hackbench/50%-threads-socket/lkp-sb03
      2279 ±  0%      -0.3%       2271 ±  0%  hackbench/performance-1600%-process-pipe/avoton3
      2686 ±  5%      -4.9%       2554 ±  4%  hackbench/performance-1600%-process-pipe/brickland3
      2362 ±  0%      -2.1%       2312 ±  0%  hackbench/performance-1600%-process-pipe/grantley
      2744 ±  0%      -1.1%       2714 ±  0%  hackbench/performance-1600%-process-pipe/ivb41
      2743 ±  0%      -1.3%       2708 ±  0%  hackbench/performance-1600%-process-pipe/ivb42
      3221 ±  0%      -2.4%       3143 ±  0%  hackbench/performance-1600%-process-pipe/lituya
      2283 ±  0%      -0.5%       2272 ±  0%  hackbench/performance-1600%-process-pipe/lkp-bdw02
      1489 ±  0%      -0.1%       1488 ±  0%  hackbench/performance-1600%-process-pipe/lkp-bdw03
      2220            -0.3%       2212 ±  0%  hackbench/performance-1600%-process-pipe/lkp-bsw01
      2784 ±  0%      -1.0%       2757 ±  0%  hackbench/performance-1600%-process-pipe/lkp-hsw01
      2052 ± 12%     -21.6%       1608 ± 13%  hackbench/performance-1600%-process-pipe/lkp-hsx02
      3081 ±  0%      -1.6%       3030 ±  0%  hackbench/performance-1600%-process-pipe/lkp-ne04
      2047 ±  0%      -1.8%       2010 ±  1%  hackbench/performance-1600%-process-pipe/lkp-nex04
      2659 ±  0%      -5.3%       2519 ±  0%  hackbench/performance-1600%-process-pipe/lkp-sbx04
      1190 ±  0%      -0.0%       1190 ±  0%  hackbench/performance-1600%-process-pipe/lkp-t410
      3275 ±  0%      -0.6%       3255 ±  0%  hackbench/performance-1600%-process-pipe/nhm4
      3418 ±  0%      -0.9%       3387 ±  0%  hackbench/performance-1600%-process-pipe/wsm
      2286 ±  0%      -0.1%       2283 ±  0%  hackbench/performance-1600%-process-socket/avoton3
      2356 ±  0%      -0.5%       2343 ±  1%  hackbench/performance-1600%-process-socket/grantley
      2814 ±  0%      -0.4%       2804        hackbench/performance-1600%-process-socket/ivb41
      2814 ±  0%      -0.6%       2795 ±  0%  hackbench/performance-1600%-process-socket/ivb42
      3216 ±  0%      -0.2%       3209 ±  0%  hackbench/performance-1600%-process-socket/lituya
      2274 ±  0%      +0.1%       2278 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw01
      2282 ±  0%      -0.2%       2278 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw02
      1489 ±  0%      -0.1%       1487 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw03
      3113 ±  0%      -0.2%       3108 ±  0%  hackbench/performance-1600%-process-socket/lkp-ne04
      2169 ±  0%      -4.1%       2080 ±  7%  hackbench/performance-1600%-process-socket/lkp-nex04
      1192 ±  0%      -0.0%       1192 ±  0%  hackbench/performance-1600%-process-socket/lkp-t410
      2233 ±  0%      -0.6%       2220 ±  0%  hackbench/performance-1600%-process-socket/lkp-wsx02
      3293 ±  0%      -0.3%       3283 ±  0%  hackbench/performance-1600%-process-socket/nhm4
      3446 ±  0%      -0.2%       3439 ±  0%  hackbench/performance-1600%-process-socket/wsm
      2288 ±  0%      -0.1%       2287 ±  0%  hackbench/performance-1600%-threads-pipe/avoton3
      3266 ±  0%      -0.1%       3261 ±  0%  hackbench/performance-1600%-threads-pipe/lituya
      2288 ±  0%      -0.0%       2287 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-bdw01
      2288 ±  0%      -0.1%       2286 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-bdw02
      1491 ±  0%      +0.1%       1492 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-bdw03
      2424 ±  3%      -0.3%       2418 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-hsx03
      3138 ±  0%      -0.1%       3135 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-ne04
      1194 ±  0%      +0.0%       1194 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-t410
      2442 ±  0%      -1.8%       2399 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-wsx02
      3310 ±  0%      -0.1%       3305 ±  0%  hackbench/performance-1600%-threads-pipe/nhm4
      3480 ±  0%      -0.1%       3477 ±  0%  hackbench/performance-1600%-threads-pipe/wsm
      2293 ±  0%      -0.0%       2292 ±  0%  hackbench/performance-1600%-threads-socket/avoton3
      2904 ±  1%      -0.1%       2901 ±  0%  hackbench/performance-1600%-threads-socket/ivb42
      3262 ±  0%      -1.6%       3209 ±  1%  hackbench/performance-1600%-threads-socket/lituya
      2287 ±  0%      -0.1%       2284 ±  0%  hackbench/performance-1600%-threads-socket/lkp-bdw01
      2287 ±  0%      -0.1%       2286 ±  0%  hackbench/performance-1600%-threads-socket/lkp-bdw02
      1080 ± 31%     +13.0%       1221 ± 27%  hackbench/performance-1600%-threads-socket/lkp-bdw03
      3159 ±  0%      -0.1%       3156 ±  0%  hackbench/performance-1600%-threads-socket/lkp-ne04
      2303 ±  0%      -0.4%       2293 ±  0%  hackbench/performance-1600%-threads-socket/lkp-nex04
      1194 ±  0%      +0.1%       1195 ±  0%  hackbench/performance-1600%-threads-socket/lkp-t410
      2431 ±  0%      -0.7%       2414 ±  0%  hackbench/performance-1600%-threads-socket/lkp-wsx02
      3306 ±  0%      -9.0%       3008 ± 19%  hackbench/performance-1600%-threads-socket/nhm4
      3485 ±  0%      -0.1%       3480 ±  0%  hackbench/performance-1600%-threads-socket/wsm
      3185 ±  0%      -0.2%       3180 ±  0%  hackbench/performance-50%-process-pipe/brickland3
      2579 ±  0%      -0.1%       2576 ±  0%  hackbench/performance-50%-process-pipe/grantley
      2976 ±  0%      -0.3%       2966 ±  0%  hackbench/performance-50%-process-pipe/ivb42
      3277 ±  0%      -0.2%       3272 ±  0%  hackbench/performance-50%-process-pipe/lituya
      2285 ±  0%      +0.1%       2288 ±  0%  hackbench/performance-50%-process-pipe/lkp-bdw01
      2288 ±  0%      +0.0%       2288 ±  0%  hackbench/performance-50%-process-pipe/lkp-bdw02
      1492 ±  0%      -9.4%       1353 ± 20%  hackbench/performance-50%-process-pipe/lkp-bdw03
      3076 ±  0%      -0.3%       3068 ±  0%  hackbench/performance-50%-process-pipe/lkp-hsw01
      2688 ±  0%      -0.1%       2685 ±  0%  hackbench/performance-50%-process-pipe/lkp-hsx03
      3184 ±  0%      -0.1%       3180 ±  0%  hackbench/performance-50%-process-socket/brickland3
      2581 ±  0%      -0.3%       2573 ±  0%  hackbench/performance-50%-process-socket/grantley
      2976 ±  0%      -0.3%       2968 ±  0%  hackbench/performance-50%-process-socket/ivb42
      3278 ±  0%      -1.1%       3241 ±  2%  hackbench/performance-50%-process-socket/lituya
      2289 ±  0%      -0.0%       2288 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw01
      2289 ±  0%      -0.1%       2287 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw02
      1493 ±  0%      -0.0%       1493 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw03
      3071 ±  0%      -0.5%       3055 ±  0%  hackbench/performance-50%-process-socket/lkp-hsw01
      2887 ±  0%      -0.1%       2884 ±  0%  hackbench/performance-50%-process-socket/lkp-hsx02
      2688 ±  0%      -0.1%       2685 ±  0%  hackbench/performance-50%-process-socket/lkp-hsx03
      2882 ±  0%      -0.3%       2873 ±  0%  hackbench/performance-50%-process-socket/lkp-sbx04
      3184 ±  0%      -0.1%       3180 ±  0%  hackbench/performance-50%-threads-pipe/brickland3
      2581 ±  0%      -0.2%       2577 ±  0%  hackbench/performance-50%-threads-pipe/grantley
      2977 ±  0%      -0.3%       2968 ±  0%  hackbench/performance-50%-threads-pipe/ivb42
      3277 ±  0%      -6.7%       3056 ±  8%  hackbench/performance-50%-threads-pipe/lituya
      2288 ±  0%      +0.0%       2289 ±  0%  hackbench/performance-50%-threads-pipe/lkp-bdw02
      1493 ±  0%      -1.8%       1466 ±  3%  hackbench/performance-50%-threads-pipe/lkp-bdw03
      3077 ±  0%      -0.6%       3059 ±  0%  hackbench/performance-50%-threads-pipe/lkp-hsw01
      2888 ±  0%      -0.1%       2885 ±  0%  hackbench/performance-50%-threads-pipe/lkp-hsx02
      2879 ±  0%      -0.0%       2878 ±  0%  hackbench/performance-50%-threads-pipe/lkp-sbx04
      3185 ±  0%      -0.2%       3179 ±  0%  hackbench/performance-50%-threads-socket/brickland3
      2581 ±  0%      -0.2%       2576 ±  0%  hackbench/performance-50%-threads-socket/grantley
      2976 ±  0%      -0.3%       2967 ±  0%  hackbench/performance-50%-threads-socket/ivb42
      3278 ±  0%      -0.1%       3274 ±  0%  hackbench/performance-50%-threads-socket/lituya
      2288 ±  0%      +0.0%       2289 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw01
      2289 ±  0%      +0.0%       2289 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw02
      1493 ±  0%      +0.0%       1493 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw03
      2881 ±  0%      -0.2%       2874 ±  0%  hackbench/performance-50%-threads-socket/lkp-sbx04
      2370 ±  0%      -0.1%       2368 ±  0%  hackbench/performance-disable_mtrr_trim-1600%-process-pipe/loslunas
      2311 ±  0%      -1.4%       2279 ±  3%  hackbench/powersave-1600%-process-pipe/grantley
      3208 ±  0%      -2.1%       3140 ±  0%  hackbench/powersave-1600%-process-pipe/lituya
      2287 ±  1%      -1.6%       2251 ±  1%  hackbench/powersave-1600%-process-socket/grantley
      3201 ±  0%      -0.1%       3199 ±  0%  hackbench/powersave-1600%-process-socket/lituya
      2508 ±  0%      -0.4%       2497 ±  0%  hackbench/powersave-1600%-threads-pipe/grantley
      3262 ±  0%      -0.2%       3255 ±  0%  hackbench/powersave-1600%-threads-pipe/lituya
      3253 ±  0%      -0.1%       3250 ±  0%  hackbench/powersave-1600%-threads-socket/lituya
      2579 ±  0%      -0.2%       2575 ±  0%  hackbench/powersave-50%-process-pipe/grantley
      3272 ±  0%      -0.0%       3270 ±  0%  hackbench/powersave-50%-process-pipe/lituya
      2573 ±  0%      +0.0%       2575 ±  0%  hackbench/powersave-50%-process-socket/grantley
      3278 ±  0%      -0.1%       3276 ±  0%  hackbench/powersave-50%-process-socket/lituya
      2581 ±  0%      -0.2%       2576 ±  0%  hackbench/powersave-50%-threads-pipe/grantley
      3277 ±  0%      -0.4%       3263 ±  0%  hackbench/powersave-50%-threads-pipe/lituya
      2581 ±  0%      -0.3%       2574 ±  0%  hackbench/powersave-50%-threads-socket/grantley
      3278 ±  0%      -0.1%       3276 ±  0%  hackbench/powersave-50%-threads-socket/lituya
      2322 ±  0%      -1.4%       2289 ±  0%  hpcc/performance/lkp-hsx03
      2723 ±  0%      -1.3%       2686 ±  0%  hpcc/performance/lkp-sbx04
      1629 ±  0%      +0.1%       1631 ±  0%  kbuild/performance-200%/grantley
       952 ±  0%      -2.7%        926 ±  0%  kbuild/performance-50%/grantley
      1453 ±  0%      -2.4%       1418 ±  1%  kbuild/powersave-200%/grantley
       845 ±  0%      -3.4%        816 ±  0%  kbuild/powersave-50%/grantley
      2065 ±  0%      -0.0%       2065 ±  0%  linpack/performance/ivb42
        24 ±  1%     +34.5%         33 ±  7%  nepim/300s-100%-tcp/lkp-hsw01
         9            +0.0%          9 ±  0%  nepim/300s-100%-tcp6/lkp-hsw01
         8 ±  5%      +0.0%          8 ±  5%  nepim/300s-100%-udp/lkp-hsw01
        49 ±  0%      +0.0%         49 ±  2%  nepim/300s-100%-udp6/lkp-hsw01
        13 ±  0%     +19.2%         15 ± 16%  nepim/300s-25%-tcp/lkp-hsw01
         8 ±  5%     +11.8%          9 ±  5%  nepim/300s-25%-tcp6/lkp-hsw01
         6 ±  7%      -7.7%          6 ±  0%  nepim/300s-25%-udp/lkp-hsw01
        22 ±  0%      +0.0%         22 ±  0%  nepim/300s-25%-udp6/lkp-hsw01
       121 ±  3%      -3.7%        117 ±  1%  netpipe/performance-tcp/ivb42
         2 ±  0%      +0.0%          2 ±  0%  nuttcp/300s/lkp-hsw01
      2973 ±  0%      -0.6%       2956 ±  0%  pbzip2/performance-100%-500K/ivb42
      2970 ±  0%      -1.2%       2933 ±  0%  pbzip2/performance-100%-900K/ivb42
       755 ±  0%      +0.3%        757 ±  0%  pbzip2/performance-25%-500K/ivb42
       754 ±  0%      +0.1%        755 ±  0%  pbzip2/performance-25%-900K/ivb42
      1280 ±  0%      -2.3%       1250 ±  0%  pft/performance-disable_mtrr_trim-20x/loslunas
      3105 ±  0%      -0.4%       3093 ±  0%  pigz/performance-100%-128K/xps2
      3105 ±  0%      -0.6%       3087 ±  0%  pigz/performance-100%-512K/xps2
      2349 ±  0%      -1.2%       2320 ±  0%  pixz/performance-100%/lkp-wsx02
      2710 ±  0%      -1.0%       2684 ±  0%  plzip/performance-100%/lkp-sbx04
       742 ±  0%      -0.4%        739 ±  0%  plzip/performance-25%/lkp-sbx04
       624 ±  0%      +0.0%        624 ±  0%  tcrypt/performance-2s-200-204/avoton1
       624 ±  0%      +0.0%        624 ±  0%  tcrypt/performance-2s-205-210/avoton1
       623 ±  0%      +0.3%        625 ±  0%  tcrypt/performance-2s-301-319/avoton1
       623 ±  0%      +0.2%        624 ±  0%  tcrypt/performance-2s-401-417/avoton1
       623 ±  0%      +0.2%        624 ±  0%  tcrypt/performance-2s-500-504/avoton1
       302 ±  0%     +19.3%        361 ±  8%  tlbflush/performance-disable_mtrr_trim-200%-32x-512/loslunas
       934 ± 22%     +22.0%       1140 ±  0%  unixbench/performance-disable_mtrr_trim-context1/loslunas
      1061 ±  0%      -0.4%       1057 ±  0%  unixbench/performance-disable_mtrr_trim-dhry2reg/loslunas
      1121 ±  0%     -48.9%        573 ±  1%  unixbench/performance-disable_mtrr_trim-execl/loslunas
      1065 ±  0%      -0.8%       1057 ±  0%  unixbench/performance-disable_mtrr_trim-pipe/loslunas
      1272 ±  0%     -38.4%        783 ±  1%  unixbench/performance-disable_mtrr_trim-shell1/loslunas
      1613 ±  0%     -13.0%       1403 ±  1%  unixbench/performance-disable_mtrr_trim-shell8/loslunas
      1101 ±  0%     -55.7%        487 ±  0%  unixbench/performance-disable_mtrr_trim-spawn/loslunas
      1062 ±  0%      -0.5%       1057 ±  0%  unixbench/performance-disable_mtrr_trim-syscall/loslunas
      1119 ±  0%      -1.0%       1108 ±  0%  unixbench/performance-disable_mtrr_trim-whetstone-double/loslunas
        52 ±  6%      -2.9%         51 ±  3%  vm-scalability/performance-300s-128G-truncate/lkp-hsx03
      1371 ±  0%     -17.0%       1137 ±  0%  vm-scalability/performance-300s-16G-shm-pread-rand-mt/lkp-hsx03
      2582 ±  0%      -0.7%       2565 ±  0%  vm-scalability/performance-300s-16G-shm-pread-rand/lkp-hsx03
      1692 ±  1%     -17.6%       1395 ±  0%  vm-scalability/performance-300s-16G-shm-xread-rand-mt/lkp-hsx03
      2582 ±  0%      -0.9%       2559 ±  0%  vm-scalability/performance-300s-16G-shm-xread-rand/lkp-hsx03
       754 ±  0%      -8.3%        691 ±  0%  vm-scalability/performance-300s-1T-msync-mt/lkp-hsx03
        75 ±  0%      +3.3%         78 ±  6%  vm-scalability/performance-300s-1T-msync/lkp-hsx03
       303 ±  1%     -25.7%        225 ±  1%  vm-scalability/performance-300s-1T-remap/lkp-hsx03
      1216 ±  8%      -6.5%       1137 ±  1%  vm-scalability/performance-300s-256G-lru-shm-rand/lkp-hsx03
      1890 ±  0%      -1.2%       1867 ±  0%  vm-scalability/performance-300s-2T-shm-pread-seq/lkp-hsx03
       599 ±  0%      -3.9%        576 ±  0%  vm-scalability/performance-300s-2T-shm-xread-seq-mt/lkp-hsx03
      1880 ±  0%      +0.3%       1885 ±  0%  vm-scalability/performance-300s-2T-shm-xread-seq/lkp-hsx03
      2503 ±  0%      -1.9%       2456 ±  0%  vm-scalability/performance-300s-512G-anon-cow-rand-mt/lkp-hsx03
      2379 ±  0%      -3.4%       2299 ±  0%  vm-scalability/performance-300s-512G-anon-cow-rand/lkp-hsx03
      1965 ±  3%     -11.9%       1731 ±  0%  vm-scalability/performance-300s-512G-anon-w-rand-mt/lkp-hsx03
      2294 ±  0%      -8.8%       2092 ±  0%  vm-scalability/performance-300s-512G-anon-w-rand/lkp-hsx03
      2471 ±  0%      -5.0%       2347 ±  3%  vm-scalability/performance-300s-512G-anon-wx-rand-mt/lkp-hsx03
      1752 ±  0%     -11.6%       1549 ±  4%  vm-scalability/performance-300s-8T-anon-cow-seq-mt/lkp-hsx03
      1595 ±  0%      +5.3%       1680 ±  0%  vm-scalability/performance-300s-8T-anon-cow-seq/lkp-hsx03
      1219 ±  0%     -85.0%        182 ±  3%  vm-scalability/performance-300s-8T-anon-w-seq-mt/lkp-hsx03
      2358 ±  0%     -23.9%       1795 ±  0%  vm-scalability/performance-300s-8T-anon-w-seq/lkp-hsx03
      2060 ±  2%     -14.9%       1753 ±  1%  vm-scalability/performance-300s-8T-anon-wx-seq-mt/lkp-hsx03
      2569 ±  0%      -0.4%       2558 ±  0%  vm-scalability/performance-300s-anon-r-rand-mt/lkp-hsx03
      2515 ±  1%      +3.3%       2598 ±  0%  vm-scalability/performance-300s-anon-r-rand/lkp-hsx03
      2646 ±  0%      -1.5%       2607 ±  0%  vm-scalability/performance-300s-anon-r-seq/lkp-hsx03
        22 ±  0%      +0.0%         22 ±  0%  vm-scalability/performance-300s-migrate/lkp-hsx03
      2650 ±  0%      -0.3%       2642 ±  0%  vm-scalability/performance-300s-mmap-pread-rand/lkp-hsx03
      1738 ±  3%     +11.4%       1936 ±  0%  vm-scalability/performance-300s-mmap-pread-seq-mt/lkp-hsx03
      2463 ±  0%      -0.0%       2462 ±  0%  vm-scalability/performance-300s-mmap-xread-rand-mt/lkp-hsx03
       143 ±  0%     +35.9%        195 ±  0%  vm-scalability/performance-300s-small-allocs-mt/lkp-hsx03
       158 ±  0%     +16.4%        184 ±  0%  vm-scalability/performance-300s-small-allocs/lkp-hsx03
       620 ±  0%      -0.4%        618 ±  0%  will-it-scale/performance-page_fault1/brickland1
       513 ±  0%      -0.1%        513 ±  0%  will-it-scale/performance-pread1/brickland1
       495 ±  0%      -0.2%        494 ±  0%  will-it-scale/performance-pwrite3/brickland1
       513 ±  0%      +0.0%        513 ±  0%  will-it-scale/performance-sched_yield/brickland1
       513 ±  0%      -0.3%        512 ±  0%  will-it-scale/performance-write1/brickland1
       513 ±  0%      -0.1%        513 ±  0%  will-it-scale/performance-writeseek1/brickland1
       346 ±  0%      +2.5%        355 ±  0%  will-it-scale/powersave-context_switch1/brickland1
       268 ±  0%      -0.2%        267 ±  0%  will-it-scale/powersave-eventfd1/brickland1
       268 ±  0%      -0.2%        267 ±  0%  will-it-scale/powersave-lock1/brickland1
       279 ±  0%      -8.9%        254 ±  0%  will-it-scale/powersave-open1/brickland1
       218 ±  0%      +0.0%        218 ±  0%  will-it-scale/powersave-page_fault3/brickland1
       268 ±  0%      -0.2%        267 ±  0%  will-it-scale/powersave-pread1/brickland1
       267 ±  0%      -0.2%        267 ±  0%  will-it-scale/powersave-read1/brickland1
      1313            -2.9%       1275        GEO-MEAN turbostat.Avg_MHz

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
    348.37 ±  0%     -34.9%     226.88 ±  0%  aim7/performance-2000-fork_test/brickland3
    490.61 ±  0%      +0.1%     490.94 ±  0%  aim7/performance-3000-brk_test/brickland3
    328.73 ±  5%      +0.8%     331.28 ±  2%  aim7/performance-3000-disk_cp/brickland3
    324.27 ±  4%      -8.4%     297.09 ±  0%  aim7/performance-3000-disk_rd/brickland3
    524.53 ±  0%      +1.3%     531.41 ±  0%  aim7/performance-3000-sieve/brickland3
    402.66 ±  1%     -17.6%     331.60 ±  0%  aim7/performance-8000-disk_src/brickland3
    279.63 ±  7%      -4.4%     267.26 ±  2%  aim7/performance-8000-disk_wrt/brickland3
    401.08 ±  0%      -0.9%     397.33 ±  0%  aim7/performance-8000-mem_rtns_1/brickland3
    187.70 ±  0%      +8.4%     203.38 ±  1%  aim7/performance-8000-misc_rtns_1/brickland3
    383.44 ±  0%      -0.3%     382.28 ±  0%  aim7/performance-8000-sort_rtns_1/brickland3
    418.93 ±  0%      -1.3%     413.51 ±  0%  aim7/performance-8000-string_rtns/brickland3
     82.38 ±  2%      +0.4%      82.74 ±  2%  fsmark/performance-1x-1t-1BRD_32G-ext4-4K-4G-fsyncBeforeClose-1fpd/ivb43
     84.75 ±  2%      -2.6%      82.55 ±  0%  fsmark/performance-1x-1t-1BRD_32G-ext4-nfsv4-4K-4G-fsyncBeforeClose-1fpd/ivb43
     82.63 ±  0%      -1.1%      81.77 ±  1%  fsmark/performance-1x-1t-1BRD_32G-xfs-4K-4G-fsyncBeforeClose-1fpd/ivb43
     83.98 ±  2%      -2.0%      82.27 ±  2%  fsmark/performance-1x-1t-1BRD_32G-xfs-nfsv4-4K-4G-fsyncBeforeClose-1fpd/ivb43
     27.46 ±  0%      +0.9%      27.71 ±  0%  fwq/performance-disable_mtrr_trim-100%-20x-100000ss/loslunas
    195.25 ±  0%      -1.1%     193.04 ±  3%  hackbench/1600%-process-pipe/lkp-sb03
    197.75 ±  0%      -0.6%     196.60 ±  0%  hackbench/1600%-process-socket/lkp-sb03
    202.54 ±  0%      +0.8%     204.17 ±  0%  hackbench/1600%-threads-pipe/lkp-sb03
    205.00 ±  0%      -0.2%     204.51 ±  0%  hackbench/1600%-threads-socket/lkp-sb03
    194.08 ±  0%      +1.0%     195.95 ±  0%  hackbench/50%-process-pipe/lkp-sb03
    194.97 ±  0%      +1.5%     197.95 ±  0%  hackbench/50%-process-socket/lkp-sb03
    194.39 ±  0%      +1.1%     196.51 ±  0%  hackbench/50%-threads-pipe/lkp-sb03
    196.58 ±  0%      +1.4%     199.27 ±  0%  hackbench/50%-threads-socket/lkp-sb03
      8.78 ±  0%      +0.4%       8.82 ±  0%  hackbench/performance-1600%-process-pipe/avoton3
    468.92 ±  2%      -2.0%     459.50 ±  2%  hackbench/performance-1600%-process-pipe/brickland3
    184.17 ±  0%      +0.5%     185.03 ±  0%  hackbench/performance-1600%-process-pipe/grantley
    194.08 ±  0%      -0.3%     193.58 ±  0%  hackbench/performance-1600%-process-pipe/ivb41
    197.10 ±  0%      -0.3%     196.56 ±  0%  hackbench/performance-1600%-process-pipe/ivb42
     91.52 ±  0%      -0.2%      91.36 ±  0%  hackbench/performance-1600%-process-pipe/lituya
      9.21 ±  0%      +0.9%       9.30 ±  0%  hackbench/performance-1600%-process-pipe/lkp-bdw02
      6.92 ±  0%      +0.2%       6.93 ±  0%  hackbench/performance-1600%-process-pipe/lkp-bdw03
    273.27 ±  0%      -0.4%     272.06 ±  0%  hackbench/performance-1600%-process-pipe/lkp-hsw01
    533.55 ±  4%      -6.9%     496.78 ±  3%  hackbench/performance-1600%-process-pipe/lkp-hsx02
    335.59 ±  0%      +0.8%     338.37 ±  0%  hackbench/performance-1600%-process-pipe/lkp-sbx04
      8.78 ±  0%      +0.0%       8.78 ±  0%  hackbench/performance-1600%-process-socket/avoton3
    188.46 ±  0%      -1.3%     185.92 ±  1%  hackbench/performance-1600%-process-socket/grantley
    190.81 ±  0%      -0.2%     190.51        hackbench/performance-1600%-process-socket/ivb41
    193.78 ±  0%      -0.5%     192.84 ±  0%  hackbench/performance-1600%-process-socket/ivb42
     90.81 ±  0%      -0.4%      90.41 ±  0%  hackbench/performance-1600%-process-socket/lituya
     10.07 ±  0%      +0.2%      10.09 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw01
      9.81 ±  0%      -0.2%       9.78 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw02
      7.14 ±  0%      -0.3%       7.12 ±  0%  hackbench/performance-1600%-process-socket/lkp-bdw03
      8.83 ±  0%      +0.1%       8.84 ±  0%  hackbench/performance-1600%-threads-pipe/avoton3
     92.91 ±  0%      -0.1%      92.84 ±  0%  hackbench/performance-1600%-threads-pipe/lituya
      9.57 ±  0%      +1.0%       9.67 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-bdw01
      9.28 ±  0%      +0.9%       9.36 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-bdw02
      6.95 ±  0%      +0.6%       6.99 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-bdw03
    547.48 ±  2%      +3.3%     565.41 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-hsx03
      8.78 ±  0%      +0.1%       8.79 ±  0%  hackbench/performance-1600%-threads-socket/avoton3
    198.22 ±  1%      +0.1%     198.36 ±  0%  hackbench/performance-1600%-threads-socket/ivb42
     92.32 ±  0%      -2.0%      90.45 ±  1%  hackbench/performance-1600%-threads-socket/lituya
     10.15 ±  0%      -0.4%      10.11 ±  0%  hackbench/performance-1600%-threads-socket/lkp-bdw01
      9.84 ±  0%      -0.3%       9.81 ±  0%  hackbench/performance-1600%-threads-socket/lkp-bdw02
     11.85 ± 32%     -13.1%      10.30 ± 37%  hackbench/performance-1600%-threads-socket/lkp-bdw03
    509.45 ±  0%      +0.3%     510.89 ±  0%  hackbench/performance-50%-process-pipe/brickland3
    180.62 ±  0%      -0.1%     180.51 ±  0%  hackbench/performance-50%-process-pipe/grantley
    198.47 ±  0%      +0.6%     199.66 ±  0%  hackbench/performance-50%-process-pipe/ivb42
     87.70 ±  0%      +1.2%      88.76 ±  0%  hackbench/performance-50%-process-pipe/lituya
      9.16 ±  0%      +1.4%       9.29 ±  0%  hackbench/performance-50%-process-pipe/lkp-bdw01
      8.87 ±  0%      +1.4%       8.99 ±  0%  hackbench/performance-50%-process-pipe/lkp-bdw02
      6.76 ±  0%     +24.6%       8.43 ± 38%  hackbench/performance-50%-process-pipe/lkp-bdw03
    271.56 ±  0%      +0.9%     274.00 ±  0%  hackbench/performance-50%-process-pipe/lkp-hsw01
    551.67 ±  0%      +0.1%     552.06 ±  0%  hackbench/performance-50%-process-pipe/lkp-hsx03
    512.93 ±  0%      +0.3%     514.24 ±  0%  hackbench/performance-50%-process-socket/brickland3
    185.50 ±  0%      +0.4%     186.32 ±  0%  hackbench/performance-50%-process-socket/grantley
    201.22 ±  0%      +1.1%     203.36 ±  0%  hackbench/performance-50%-process-socket/ivb42
     88.55 ±  0%      -0.2%      88.41 ±  1%  hackbench/performance-50%-process-socket/lituya
      9.37 ±  0%      +1.0%       9.46 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw01
      9.08 ±  0%      +0.8%       9.15 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw02
      6.83 ±  0%      +0.8%       6.89 ±  0%  hackbench/performance-50%-process-socket/lkp-bdw03
    279.26 ±  0%      +0.8%     281.62 ±  0%  hackbench/performance-50%-process-socket/lkp-hsw01
    574.24 ±  0%      +0.1%     574.58 ±  0%  hackbench/performance-50%-process-socket/lkp-hsx02
    555.31 ±  0%      +0.8%     559.61 ±  0%  hackbench/performance-50%-process-socket/lkp-hsx03
    341.21 ±  0%      +0.7%     343.55 ±  0%  hackbench/performance-50%-process-socket/lkp-sbx04
    511.01 ±  0%      +0.0%     511.16 ±  0%  hackbench/performance-50%-threads-pipe/brickland3
    181.11 ±  0%      -0.3%     180.52 ±  0%  hackbench/performance-50%-threads-pipe/grantley
    198.68 ±  0%      +0.6%     199.96 ±  0%  hackbench/performance-50%-threads-pipe/ivb42
     88.72 ±  0%      -4.7%      84.57 ±  7%  hackbench/performance-50%-threads-pipe/lituya
      8.94 ±  0%      +1.4%       9.06 ±  0%  hackbench/performance-50%-threads-pipe/lkp-bdw02
      6.80 ±  0%      -0.5%       6.77 ±  2%  hackbench/performance-50%-threads-pipe/lkp-bdw03
    272.40 ±  0%      +0.3%     273.15 ±  0%  hackbench/performance-50%-threads-pipe/lkp-hsw01
    567.55 ±  0%      +0.1%     568.07 ±  0%  hackbench/performance-50%-threads-pipe/lkp-hsx02
    340.08 ±  0%      +0.5%     341.73 ±  0%  hackbench/performance-50%-threads-pipe/lkp-sbx04
    513.56 ±  0%      +0.2%     514.69 ±  0%  hackbench/performance-50%-threads-socket/brickland3
    185.94 ±  0%      -0.0%     185.93 ±  0%  hackbench/performance-50%-threads-socket/grantley
    201.50 ±  0%      +1.0%     203.49 ±  0%  hackbench/performance-50%-threads-socket/ivb42
     89.23 ±  0%      +1.1%      90.20 ±  0%  hackbench/performance-50%-threads-socket/lituya
      9.39 ±  0%      +1.3%       9.52 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw01
      9.11 ±  0%      +1.0%       9.21 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw02
      6.85 ±  0%      +1.0%       6.92 ±  0%  hackbench/performance-50%-threads-socket/lkp-bdw03
    342.90 ±  0%      +0.3%     343.83 ±  0%  hackbench/performance-50%-threads-socket/lkp-sbx04
     20.05 ±  0%      +0.9%      20.23 ±  0%  hackbench/performance-disable_mtrr_trim-1600%-process-pipe/loslunas
    183.31 ±  0%      -1.8%     180.03 ±  2%  hackbench/powersave-1600%-process-pipe/grantley
     90.24 ±  0%      +0.9%      91.05 ±  0%  hackbench/powersave-1600%-process-pipe/lituya
    181.51 ±  1%      -2.5%     176.95 ±  1%  hackbench/powersave-1600%-process-socket/grantley
     89.72 ±  0%      +0.3%      89.97 ±  0%  hackbench/powersave-1600%-process-socket/lituya
    197.02 ±  0%      -1.0%     195.05 ±  0%  hackbench/powersave-1600%-threads-pipe/grantley
     92.52 ±  0%      +0.1%      92.65 ±  0%  hackbench/powersave-1600%-threads-pipe/lituya
     91.48 ±  0%      +0.0%      91.50 ±  0%  hackbench/powersave-1600%-threads-socket/lituya
    180.65 ±  0%      -0.1%     180.50 ±  0%  hackbench/powersave-50%-process-pipe/grantley
     86.96 ±  0%      +2.1%      88.79 ±  0%  hackbench/powersave-50%-process-pipe/lituya
    185.08 ±  0%      +0.8%     186.63 ±  0%  hackbench/powersave-50%-process-socket/grantley
     87.80 ±  0%      +1.7%      89.26 ±  0%  hackbench/powersave-50%-process-socket/lituya
    180.93 ±  0%      -0.1%     180.76 ±  0%  hackbench/powersave-50%-threads-pipe/grantley
     88.23 ±  0%      +1.4%      89.49 ±  0%  hackbench/powersave-50%-threads-pipe/lituya
    185.72 ±  0%      +0.4%     186.44 ±  0%  hackbench/powersave-50%-threads-socket/grantley
     88.51 ±  0%      +1.9%      90.19 ±  0%  hackbench/powersave-50%-threads-socket/lituya
    599.02 ±  0%      -0.9%     593.70 ±  0%  hpcc/performance/lkp-hsx03
    353.53 ±  0%      -0.3%     352.54 ±  0%  hpcc/performance/lkp-sbx04
    150.32 ±  0%      +1.2%     152.13 ±  0%  kbuild/performance-200%/grantley
    144.44 ±  0%      -3.0%     140.10 ±  0%  kbuild/performance-50%/grantley
    136.95 ±  0%      -1.1%     135.49 ±  0%  kbuild/powersave-200%/grantley
    131.02 ±  0%      -4.3%     125.34 ±  0%  kbuild/powersave-50%/grantley
    229.66 ±  0%      -0.1%     229.52 ±  0%  linpack/performance/ivb42
     50.85 ±  0%      +4.0%      52.88 ±  2%  nepim/300s-100%-tcp/lkp-hsw01
     28.77           +13.3%      32.59 ± 10%  nepim/300s-100%-tcp6/lkp-hsw01
     33.44 ±  1%      -3.3%      32.33 ±  2%  nepim/300s-100%-udp/lkp-hsw01
     64.34 ±  0%      -0.3%      64.12 ±  0%  nepim/300s-100%-udp6/lkp-hsw01
     47.48 ±  0%      +1.1%      48.03 ±  1%  nepim/300s-25%-tcp/lkp-hsw01
     27.41 ±  0%      +1.6%      27.86 ±  1%  nepim/300s-25%-tcp6/lkp-hsw01
     29.88 ±  5%      -8.3%      27.41 ±  0%  nepim/300s-25%-udp/lkp-hsw01
     39.45 ±  0%      +1.5%      40.05 ±  0%  nepim/300s-25%-udp6/lkp-hsw01
     91.54 ±  0%      -9.6%      82.80 ±  1%  netpipe/performance-tcp/ivb42
     25.90 ±  2%      -3.4%      25.02 ±  1%  nuttcp/300s/lkp-hsw01
    204.05 ±  0%      -0.5%     203.13 ±  0%  pbzip2/performance-100%-500K/ivb42
    192.82 ±  0%      -0.0%     192.76 ±  0%  pbzip2/performance-100%-900K/ivb42
    130.53 ±  0%      +0.6%     131.36 ±  0%  pbzip2/performance-25%-500K/ivb42
    129.71 ±  0%      +0.3%     130.10 ±  0%  pbzip2/performance-25%-900K/ivb42
     17.83 ±  0%      -1.4%      17.57 ±  0%  pft/performance-disable_mtrr_trim-20x/loslunas
    304.06 ±  0%      -0.3%     303.10 ±  0%  plzip/performance-100%/lkp-sbx04
    221.00 ±  0%      +0.0%     221.01 ±  0%  plzip/performance-25%/lkp-sbx04
      7.51 ±  3%      +4.0%       7.81 ±  0%  tcrypt/performance-2s-200-204/avoton1
      7.81 ±  0%      -2.5%       7.62 ±  2%  tcrypt/performance-2s-205-210/avoton1
      7.42 ±  0%      +5.3%       7.81 ±  0%  tcrypt/performance-2s-301-319/avoton1
      7.81 ±  0%      +0.0%       7.81 ±  0%  tcrypt/performance-2s-401-417/avoton1
      7.43 ±  1%      +3.8%       7.71 ±  1%  tcrypt/performance-2s-500-504/avoton1
     10.28 ±  0%      +5.5%      10.85 ±  4%  tlbflush/performance-disable_mtrr_trim-200%-32x-512/loslunas
     11.78 ± 16%     +15.7%      13.62 ±  0%  unixbench/performance-disable_mtrr_trim-context1/loslunas
     14.89 ±  4%      -0.7%      14.79 ±  5%  unixbench/performance-disable_mtrr_trim-dhry2reg/loslunas
     14.68 ±  0%     -17.7%      12.09 ±  0%  unixbench/performance-disable_mtrr_trim-execl/loslunas
     12.91 ±  0%      -0.1%      12.89 ±  0%  unixbench/performance-disable_mtrr_trim-pipe/loslunas
     15.38 ±  1%     -10.7%      13.74 ±  0%  unixbench/performance-disable_mtrr_trim-shell1/loslunas
     17.61 ±  1%      -5.1%      16.71 ±  0%  unixbench/performance-disable_mtrr_trim-shell8/loslunas
     15.03 ±  0%     -20.5%      11.96 ±  1%  unixbench/performance-disable_mtrr_trim-spawn/loslunas
     13.70 ±  0%      +0.3%      13.75 ±  0%  unixbench/performance-disable_mtrr_trim-syscall/loslunas
     14.25 ±  0%      -0.6%      14.17 ±  0%  unixbench/performance-disable_mtrr_trim-whetstone-double/loslunas
    134.21 ±  3%      +0.7%     135.10 ±  2%  vm-scalability/performance-300s-128G-truncate/lkp-hsx03
    309.16 ±  0%      -9.2%     280.68 ±  0%  vm-scalability/performance-300s-16G-shm-pread-rand-mt/lkp-hsx03
    494.31 ±  0%      -0.4%     492.40 ±  0%  vm-scalability/performance-300s-16G-shm-pread-rand/lkp-hsx03
    479.29 ±  0%      -3.5%     462.40 ±  0%  vm-scalability/performance-300s-16G-shm-xread-rand-mt/lkp-hsx03
    493.82 ±  0%      -0.3%     492.12 ±  0%  vm-scalability/performance-300s-16G-shm-xread-rand/lkp-hsx03
    416.69 ±  0%      -2.9%     404.75 ±  0%  vm-scalability/performance-300s-1T-msync-mt/lkp-hsx03
    180.83 ±  0%      -2.4%     176.58 ±  0%  vm-scalability/performance-300s-1T-msync/lkp-hsx03
    231.39 ±  0%      -9.7%     208.97 ±  0%  vm-scalability/performance-300s-1T-remap/lkp-hsx03
    328.54 ±  0%      +0.8%     331.06 ±  0%  vm-scalability/performance-300s-256G-lru-shm-rand/lkp-hsx03
    522.23 ±  0%      +0.1%     522.75 ±  0%  vm-scalability/performance-300s-2T-shm-pread-seq/lkp-hsx03
    426.27 ±  0%      +0.3%     427.45 ±  0%  vm-scalability/performance-300s-2T-shm-xread-seq-mt/lkp-hsx03
    522.12 ±  0%      +0.3%     523.54 ±  0%  vm-scalability/performance-300s-2T-shm-xread-seq/lkp-hsx03
    486.91 ±  0%      +0.5%     489.11 ±  0%  vm-scalability/performance-300s-512G-anon-cow-rand-mt/lkp-hsx03
    491.40 ±  0%      -2.6%     478.45 ±  0%  vm-scalability/performance-300s-512G-anon-cow-rand/lkp-hsx03
    434.11 ±  2%      -7.9%     399.68 ±  0%  vm-scalability/performance-300s-512G-anon-w-rand-mt/lkp-hsx03
    469.13 ±  0%      -2.4%     457.64 ±  0%  vm-scalability/performance-300s-512G-anon-w-rand/lkp-hsx03
    487.12 ±  0%      -2.7%     474.11 ±  3%  vm-scalability/performance-300s-512G-anon-wx-rand-mt/lkp-hsx03
    483.83 ±  0%      -9.0%     440.18 ±  4%  vm-scalability/performance-300s-8T-anon-cow-seq-mt/lkp-hsx03
    482.56 ±  0%      -0.4%     480.49 ±  0%  vm-scalability/performance-300s-8T-anon-cow-seq/lkp-hsx03
    394.29 ±  0%     -57.0%     169.62 ±  1%  vm-scalability/performance-300s-8T-anon-w-seq-mt/lkp-hsx03
    562.12 ±  0%     -11.9%     495.23 ±  0%  vm-scalability/performance-300s-8T-anon-w-seq/lkp-hsx03
    558.20 ±  2%      -9.5%     505.15 ±  0%  vm-scalability/performance-300s-8T-anon-wx-seq-mt/lkp-hsx03
    576.78 ±  0%      -0.2%     575.63 ±  0%  vm-scalability/performance-300s-anon-r-rand-mt/lkp-hsx03
    569.27 ±  1%      +2.4%     582.94 ±  0%  vm-scalability/performance-300s-anon-r-rand/lkp-hsx03
    624.65 ±  0%      -0.4%     622.18 ±  0%  vm-scalability/performance-300s-anon-r-seq/lkp-hsx03
    133.98 ±  0%      -1.7%     131.69 ±  0%  vm-scalability/performance-300s-migrate/lkp-hsx03
    500.56 ±  0%      -0.2%     499.59 ±  0%  vm-scalability/performance-300s-mmap-pread-rand/lkp-hsx03
    412.58 ±  0%      +6.3%     438.69 ±  0%  vm-scalability/performance-300s-mmap-pread-seq-mt/lkp-hsx03
    496.84 ±  0%      -0.1%     496.38 ±  0%  vm-scalability/performance-300s-mmap-xread-rand-mt/lkp-hsx03
    231.30 ±  0%     -12.5%     202.44 ±  0%  vm-scalability/performance-300s-small-allocs-mt/lkp-hsx03
    351.32 ±  0%     -31.1%     242.24 ±  1%  vm-scalability/performance-300s-small-allocs/lkp-hsx03
    376.15 ±  0%      -0.0%     376.14 ±  0%  will-it-scale/performance-page_fault1/brickland1
    895.47 ±  1%      -0.8%     888.66 ±  3%  will-it-scale/performance-pread1/brickland1
    700.12 ±  0%      -1.4%     690.16 ±  0%  will-it-scale/performance-pwrite3/brickland1
    837.10 ±  0%      +3.2%     864.23 ±  0%  will-it-scale/performance-sched_yield/brickland1
    814.01 ±  0%      +0.3%     816.47 ±  0%  will-it-scale/performance-write1/brickland1
    785.55 ±  0%      -0.2%     784.11 ±  0%  will-it-scale/performance-writeseek1/brickland1
    353.64 ±  0%      -2.1%     346.19 ±  0%  will-it-scale/powersave-context_switch1/brickland1
    529.44 ±  0%      +0.7%     533.01 ±  0%  will-it-scale/powersave-eventfd1/brickland1
    622.96 ±  0%     -29.3%     440.47 ±  0%  will-it-scale/powersave-lock1/brickland1
    466.97 ±  0%      -0.6%     464.17 ±  0%  will-it-scale/powersave-open1/brickland1
    281.06 ±  0%      +0.0%     281.06 ±  0%  will-it-scale/powersave-page_fault3/brickland1
    507.15 ±  3%      -0.6%     504.26 ±  0%  will-it-scale/powersave-pread1/brickland1
    526.77 ±  0%      +0.2%     527.88 ±  0%  will-it-scale/powersave-read1/brickland1
    112.76            -1.7%     110.81        GEO-MEAN turbostat.PkgWatt

       base       4c59e142fee20a57617cff2250  
----------------  --------------------------  
     96.17 ±  0%     -24.5%      72.62 ±  0%  aim7/performance-2000-fork_test/brickland3
     62.39 ±  0%      +0.4%      62.67 ±  0%  aim7/performance-3000-brk_test/brickland3
     62.52 ±  1%      +2.1%      63.80 ±  0%  aim7/performance-3000-disk_cp/brickland3
     63.06 ±  0%      -0.8%      62.58 ±  0%  aim7/performance-3000-disk_rd/brickland3
    140.71 ±  0%      +0.1%     140.85 ±  0%  aim7/performance-3000-sieve/brickland3
     66.52 ±  0%      -0.6%      66.13 ±  1%  aim7/performance-8000-disk_src/brickland3
     56.52 ±  2%      -3.8%      54.35 ±  3%  aim7/performance-8000-disk_wrt/brickland3
     53.45 ±  0%      +1.9%      54.48 ±  0%  aim7/performance-8000-mem_rtns_1/brickland3
     57.40 ±  1%      -2.6%      55.89 ±  0%  aim7/performance-8000-misc_rtns_1/brickland3
     52.95 ±  0%      +1.8%      53.88 ±  0%  aim7/performance-8000-sort_rtns_1/brickland3
     56.38 ±  0%      -0.3%      56.21 ±  0%  aim7/performance-8000-string_rtns/brickland3
      5.21 ±  4%      -5.6%       4.92 ±  6%  fsmark/performance-1x-1t-1BRD_32G-ext4-4K-4G-fsyncBeforeClose-1fpd/ivb43
      5.28 ±  0%      -6.8%       4.92 ±  8%  fsmark/performance-1x-1t-1BRD_32G-ext4-nfsv4-4K-4G-fsyncBeforeClose-1fpd/ivb43
      4.29 ±  0%      -2.6%       4.18 ±  1%  fsmark/performance-1x-1t-1BRD_32G-xfs-4K-4G-fsyncBeforeClose-1fpd/ivb43
      4.44 ±  1%      -4.8%       4.22 ±  2%  fsmark/performance-1x-1t-1BRD_32G-xfs-nfsv4-4K-4G-fsyncBeforeClose-1fpd/ivb43
     71.08 ±  2%      -1.0%      70.38 ±  1%  hackbench/performance-1600%-process-pipe/brickland3
     17.81 ±  2%      -6.8%      16.60 ±  1%  hackbench/performance-1600%-process-pipe/grantley
     10.59 ±  0%      -9.1%       9.62 ±  0%  hackbench/performance-1600%-process-pipe/ivb41
      9.69 ±  0%      -8.5%       8.86 ±  2%  hackbench/performance-1600%-process-pipe/ivb42
      0.65 ± 10%    +110.4%       1.38 ±  9%  hackbench/performance-1600%-process-pipe/lituya
     36.82 ±  0%      -3.9%      35.40 ±  1%  hackbench/performance-1600%-process-pipe/lkp-hsw01
     47.41 ±  0%      +0.0%      47.43 ±  0%  hackbench/performance-1600%-process-pipe/lkp-hsx02
     33.69 ±  0%      -1.9%      33.05 ±  3%  hackbench/performance-1600%-process-socket/grantley
     18.05 ±  0%      +0.2%      18.09        hackbench/performance-1600%-process-socket/ivb41
     17.52 ±  0%      -1.7%      17.23 ±  0%  hackbench/performance-1600%-process-socket/ivb42
      8.72 ±  1%      -2.6%       8.49 ±  2%  hackbench/performance-1600%-process-socket/lituya
      0.49 ± 19%    +103.7%       0.99 ± 10%  hackbench/performance-1600%-threads-pipe/lituya
     54.43 ±  0%      +0.2%      54.54 ±  0%  hackbench/performance-1600%-threads-pipe/lkp-hsx03
     18.24 ±  1%      -1.0%      18.06 ±  1%  hackbench/performance-1600%-threads-socket/ivb42
      8.14 ±  1%      +2.4%       8.34 ±  1%  hackbench/performance-1600%-threads-socket/lituya
     59.67 ±  1%      +4.8%      62.52 ±  1%  hackbench/performance-50%-process-pipe/brickland3
     12.55 ±  2%      -0.0%      12.55 ±  2%  hackbench/performance-50%-process-pipe/grantley
      5.90 ±  1%     +10.2%       6.51 ±  4%  hackbench/performance-50%-process-pipe/ivb42
     30.44 ±  0%      -2.2%      29.76 ±  1%  hackbench/performance-50%-process-pipe/lkp-hsw01
     54.35 ±  0%      +0.0%      54.35 ±  0%  hackbench/performance-50%-process-pipe/lkp-hsx03
     62.44 ±  2%      +7.5%      67.15 ±  1%  hackbench/performance-50%-process-socket/brickland3
     12.42 ±  3%     +10.9%      13.77 ±  2%  hackbench/performance-50%-process-socket/grantley
      6.49 ±  4%     +14.9%       7.45 ±  1%  hackbench/performance-50%-process-socket/ivb42
     30.52 ±  0%      +2.4%      31.24 ±  1%  hackbench/performance-50%-process-socket/lkp-hsw01
     47.25 ±  0%      -0.1%      47.20 ±  0%  hackbench/performance-50%-process-socket/lkp-hsx02
     54.36 ±  0%      -0.0%      54.33 ±  0%  hackbench/performance-50%-process-socket/lkp-hsx03
     59.82 ±  3%      +6.3%      63.60 ±  1%  hackbench/performance-50%-threads-pipe/brickland3
     12.25 ±  2%      -0.9%      12.14 ±  3%  hackbench/performance-50%-threads-pipe/grantley
      5.74 ±  2%      +9.6%       6.29 ±  1%  hackbench/performance-50%-threads-pipe/ivb42
     30.06 ±  1%      -0.8%      29.80 ±  1%  hackbench/performance-50%-threads-pipe/lkp-hsw01
     47.21 ±  0%      +0.1%      47.24 ±  0%  hackbench/performance-50%-threads-pipe/lkp-hsx02
     62.52 ±  0%     +12.1%      70.09 ±  2%  hackbench/performance-50%-threads-socket/brickland3
     11.96 ±  2%     +11.4%      13.32 ±  1%  hackbench/performance-50%-threads-socket/grantley
      6.05 ±  6%     +23.4%       7.47 ±  5%  hackbench/performance-50%-threads-socket/ivb42
     17.82 ±  3%      -4.7%      16.99 ±  2%  hackbench/powersave-1600%-process-pipe/grantley
      0.62 ±  5%     +95.8%       1.22 ±  1%  hackbench/powersave-1600%-process-pipe/lituya
     32.71 ±  2%      +0.3%      32.80 ±  3%  hackbench/powersave-1600%-process-socket/grantley
      8.72 ±  0%      -3.0%       8.46 ±  2%  hackbench/powersave-1600%-process-socket/lituya
     18.52 ±  2%      -7.6%      17.11 ±  1%  hackbench/powersave-1600%-threads-pipe/grantley
      0.69 ± 11%     +29.7%       0.89 ±  7%  hackbench/powersave-1600%-threads-pipe/lituya
      8.28 ±  1%      +1.1%       8.37 ±  2%  hackbench/powersave-1600%-threads-socket/lituya
     12.22 ±  4%      -0.7%      12.14 ±  1%  hackbench/powersave-50%-process-pipe/grantley
     12.03 ±  3%     +14.1%      13.72 ±  2%  hackbench/powersave-50%-process-socket/grantley
     11.48 ±  1%      +3.0%      11.83 ±  3%  hackbench/powersave-50%-threads-pipe/grantley
      0.44 ± 41%     -30.9%       0.30 ± 41%  hackbench/powersave-50%-threads-pipe/lituya
     11.87 ±  1%     +11.2%      13.20 ±  2%  hackbench/powersave-50%-threads-socket/grantley
     76.17 ±  0%      -1.3%      75.21 ±  0%  hpcc/performance/lkp-hsx03
     18.60 ±  0%      -1.0%      18.42 ±  0%  kbuild/performance-200%/grantley
     15.75 ±  1%      -1.8%      15.46 ±  0%  kbuild/performance-50%/grantley
     17.42 ±  0%      +0.9%      17.57 ±  1%  kbuild/powersave-200%/grantley
     15.68 ±  0%      -4.8%      14.92 ±  0%  kbuild/powersave-50%/grantley
     13.50 ±  0%      +0.1%      13.52 ±  0%  linpack/performance/ivb42
     16.07 ±  2%     +20.0%      19.28 ±  1%  nepim/300s-100%-tcp/lkp-hsw01
      5.89           +40.0%       8.24 ± 23%  nepim/300s-100%-tcp6/lkp-hsw01
      8.68 ±  0%      -9.0%       7.89 ±  6%  nepim/300s-100%-udp/lkp-hsw01
     15.23 ±  0%      +5.8%      16.12 ±  4%  nepim/300s-100%-udp6/lkp-hsw01
     16.20 ±  0%      +0.2%      16.24 ±  1%  nepim/300s-25%-tcp/lkp-hsw01
      5.63 ±  0%      +6.1%       5.98 ±  3%  nepim/300s-25%-tcp6/lkp-hsw01
      7.09 ± 13%     -20.2%       5.67 ±  1%  nepim/300s-25%-udp/lkp-hsw01
      9.59 ±  1%      +2.3%       9.81 ±  0%  nepim/300s-25%-udp6/lkp-hsw01
      3.75 ±  3%      +2.9%       3.86 ±  4%  netpipe/performance-tcp/ivb42
      4.86 ±  9%      -7.6%       4.49 ±  0%  nuttcp/300s/lkp-hsw01
     16.20 ±  0%      +0.4%      16.27 ±  0%  pbzip2/performance-100%-500K/ivb42
     19.20 ±  0%      -1.3%      18.95 ±  0%  pbzip2/performance-100%-900K/ivb42
      5.29 ±  0%      +0.6%       5.32 ±  0%  pbzip2/performance-25%-500K/ivb42
      6.22 ±  0%      -1.1%       6.15 ±  0%  pbzip2/performance-25%-900K/ivb42
     54.62 ±  0%      -0.1%      54.58 ±  0%  vm-scalability/performance-300s-128G-truncate/lkp-hsx03
     54.41 ±  0%      +0.0%      54.42 ±  0%  vm-scalability/performance-300s-16G-shm-pread-rand-mt/lkp-hsx03
     59.48 ±  0%      -3.4%      57.47 ±  3%  vm-scalability/performance-300s-16G-shm-pread-rand/lkp-hsx03
     54.20 ±  0%      +0.1%      54.23 ±  0%  vm-scalability/performance-300s-16G-shm-xread-rand-mt/lkp-hsx03
     59.63 ±  0%      -0.6%      59.27 ±  0%  vm-scalability/performance-300s-16G-shm-xread-rand/lkp-hsx03
     54.75 ±  0%      +0.7%      55.15 ±  0%  vm-scalability/performance-300s-1T-msync-mt/lkp-hsx03
     55.94 ±  0%      -0.1%      55.89 ±  0%  vm-scalability/performance-300s-1T-msync/lkp-hsx03
     58.91 ±  0%      -6.8%      54.92 ±  0%  vm-scalability/performance-300s-1T-remap/lkp-hsx03
     60.36 ±  0%      +6.0%      63.98 ±  0%  vm-scalability/performance-300s-256G-lru-shm-rand/lkp-hsx03
     54.59 ±  0%      +0.0%      54.59 ±  0%  vm-scalability/performance-300s-2T-shm-pread-seq/lkp-hsx03
     54.15 ±  0%      +0.1%      54.19 ±  0%  vm-scalability/performance-300s-2T-shm-xread-seq-mt/lkp-hsx03
     54.70 ±  0%      -0.2%      54.61 ±  0%  vm-scalability/performance-300s-2T-shm-xread-seq/lkp-hsx03
     54.89 ±  0%      +0.4%      55.08 ±  0%  vm-scalability/performance-300s-512G-anon-cow-rand-mt/lkp-hsx03
    118.84 ±  0%      -1.9%     116.62 ±  0%  vm-scalability/performance-300s-512G-anon-cow-rand/lkp-hsx03
    112.24 ±  1%      -8.1%     103.16 ±  0%  vm-scalability/performance-300s-512G-anon-w-rand-mt/lkp-hsx03
    123.90 ±  0%     -13.6%     107.01 ±  0%  vm-scalability/performance-300s-512G-anon-w-rand/lkp-hsx03
     54.55 ±  0%      +1.4%      55.33 ±  0%  vm-scalability/performance-300s-512G-anon-wx-rand-mt/lkp-hsx03
     81.62 ±  0%      +2.6%      83.71 ±  1%  vm-scalability/performance-300s-8T-anon-cow-seq-mt/lkp-hsx03
    137.03 ±  0%      +8.9%     149.25 ±  0%  vm-scalability/performance-300s-8T-anon-cow-seq/lkp-hsx03
    126.82 ±  0%     -50.4%      62.97 ±  0%  vm-scalability/performance-300s-8T-anon-w-seq-mt/lkp-hsx03
    190.14 ±  0%     -12.3%     166.68 ±  0%  vm-scalability/performance-300s-8T-anon-w-seq/lkp-hsx03
     91.51 ±  0%      +3.6%      94.76 ±  0%  vm-scalability/performance-300s-8T-anon-wx-seq-mt/lkp-hsx03
     54.52 ±  0%      +0.0%      54.52 ±  0%  vm-scalability/performance-300s-anon-r-rand-mt/lkp-hsx03
     54.51 ±  0%      +0.0%      54.52 ±  0%  vm-scalability/performance-300s-anon-r-rand/lkp-hsx03
     54.51 ±  0%      +0.0%      54.51 ±  0%  vm-scalability/performance-300s-anon-r-seq/lkp-hsx03
     54.50 ±  0%      +0.0%      54.50 ±  0%  vm-scalability/performance-300s-migrate/lkp-hsx03
     59.49 ±  0%      +7.5%      63.97 ±  0%  vm-scalability/performance-300s-mmap-pread-rand/lkp-hsx03
     59.04 ±  2%      +9.9%      64.88 ±  0%  vm-scalability/performance-300s-mmap-pread-seq-mt/lkp-hsx03
     54.39 ±  0%      +0.0%      54.41 ±  0%  vm-scalability/performance-300s-mmap-xread-rand-mt/lkp-hsx03
     54.33 ±  0%      +0.3%      54.50 ±  0%  vm-scalability/performance-300s-small-allocs-mt/lkp-hsx03
     54.41 ±  0%      +0.1%      54.47 ±  0%  vm-scalability/performance-300s-small-allocs/lkp-hsx03
     31.23 ±  0%      +0.0%      31.24 ±  0%  will-it-scale/performance-page_fault1/brickland1
     14.91 ±  0%      +0.0%      14.91 ±  0%  will-it-scale/performance-pread1/brickland1
     14.90 ±  0%      +0.0%      14.90 ±  0%  will-it-scale/performance-pwrite3/brickland1
     14.90 ±  0%      +0.0%      14.91 ±  0%  will-it-scale/performance-sched_yield/brickland1
     14.91 ±  0%      +0.0%      14.91 ±  0%  will-it-scale/performance-write1/brickland1
     14.91 ±  0%      +0.0%      14.91 ±  0%  will-it-scale/performance-writeseek1/brickland1
     14.89 ±  0%      +0.0%      14.89 ±  0%  will-it-scale/powersave-context_switch1/brickland1
     14.89 ±  0%      +0.0%      14.89 ±  0%  will-it-scale/powersave-eventfd1/brickland1
     14.89 ±  0%      +0.0%      14.89 ±  0%  will-it-scale/powersave-lock1/brickland1
     14.89 ±  0%      -0.0%      14.89 ±  0%  will-it-scale/powersave-open1/brickland1
     14.90 ±  0%      +0.0%      14.90 ±  0%  will-it-scale/powersave-page_fault3/brickland1
     14.89 ±  0%      +0.0%      14.89 ±  0%  will-it-scale/powersave-pread1/brickland1
     14.89 ±  0%      +0.0%      14.89 ±  0%  will-it-scale/powersave-read1/brickland1
     22.21            +1.4%      22.51        GEO-MEAN turbostat.RAMWatt


On Mon, May 25, 2015 at 09:49:43AM +0800, Yuyang Du wrote:
> Hi Peter and Ingo,
> 
> Changes are made for the 8th version:
> 
> 1) Rebase to the latest tip tree
> 2) scale_load_down the weight when doing the averages
> 3) change util_sum to u32
> 
> Thanks a lot for Ben's comments, which led to this version.
> 
> Regards,
> Yuyang

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 0/4] sched: Rewrite runnable load and utilization average tracking
  2015-06-09  1:21 ` Yuyang Du
@ 2015-06-15 10:38   ` Boqun Feng
  2015-06-15 18:46     ` Yuyang Du
  0 siblings, 1 reply; 13+ messages in thread
From: Boqun Feng @ 2015-06-15 10:38 UTC (permalink / raw)
  To: Yuyang Du
  Cc: mingo, peterz, linux-kernel, pjt, bsegall, morten.rasmussen,
	vincent.guittot, dietmar.eggemann, arjan.van.de.ven, len.brown,
	rafael.j.wysocki, fengguang.wu, Srikar Dronamraju

Hi Yuyang,

We have tested your V7 patchset as follow:

On an Intel(R) Xeon(R) CPU X5690 (12 cores), we ran 12 stress tasks and 6
dbench clients. Results show that the usage of some CPUs is sometimes less
than 50%.

We would like to test your V8 patchset, but I can find it neither in an
LKML archive nor in my LKML subscription. Would you please send me your
V8 patchset, if convenient? Or a public git tree works too!

Regards,
Boqun

On Tue, Jun 09, 2015 at 09:21:43AM +0800, Yuyang Du wrote:
> Ping ...
> 
> Plus some data tested by LKP:
> 
> To name a few host configurations:
> 
> host: brickland3
> model: Brickland Ivy Bridge-EX
> nr_cpu: 120
> memory: 512G
> 
> host: lkp-a03
> model: Atom
> memory: 8G
> 
> host: grantley
> model: Grantley Haswell-EP
> memory: 32G
> 
> host: ivb43
> model: Ivytown Ivy Bridge-EP
> nr_cpu: 48
> memory: 64G
> 
[snip]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 0/4] sched: Rewrite runnable load and utilization average tracking
  2015-06-15 10:38   ` Boqun Feng
@ 2015-06-15 18:46     ` Yuyang Du
  0 siblings, 0 replies; 13+ messages in thread
From: Yuyang Du @ 2015-06-15 18:46 UTC (permalink / raw)
  To: Boqun Feng
  Cc: mingo, peterz, linux-kernel, pjt, bsegall, morten.rasmussen,
	vincent.guittot, dietmar.eggemann, arjan.van.de.ven, len.brown,
	rafael.j.wysocki, fengguang.wu, Srikar Dronamraju

Hi Boqun,

Thanks for the tests.

Indeed I can't find the 8th version myself in the archive. That is weird.
Vincent certainly received the patches, but maybe the list did not, which is
interesting...

Anyway, I will rebase the patches so they are up to date, and resend them to
the list shortly.

Regarding the test results, does "less than 50% sometimes" mean the load is
imbalanced? And how does it compare with the stock kernel?

Regards,
Yuyang

On Mon, Jun 15, 2015 at 06:38:56PM +0800, Boqun Feng wrote:
> Hi Yuyang,
> 
> We have tested your V7 patchset as follow:
> 
> On an Intel(R) Xeon(R) CPU X5690 (12 cores), we ran 12 stress tasks and 6
> dbench clients. Results show that the usage of some CPUs is sometimes less
> than 50%.
> 
> We would like to test your V8 patchset, but I can find it neither in an
> LKML archive nor in my LKML subscription. Would you please send me your
> V8 patchset, if convenient? Or a public git tree works too!
> 
> Regards,
> Boqun
> 
> On Tue, Jun 09, 2015 at 09:21:43AM +0800, Yuyang Du wrote:
> > Ping ...
> > 
> > Plus some data tested by LKP:
> > 
> > To name a few host configurations:
> > 
> > host: brickland3
> > model: Brickland Ivy Bridge-EX
> > nr_cpu: 120
> > memory: 512G
> > 
> > host: lkp-a03
> > model: Atom
> > memory: 8G
> > 
> > host: grantley
> > model: Grantley Haswell-EP
> > memory: 32G
> > 
> > host: ivb43
> > model: Ivytown Ivy Bridge-EP
> > nr_cpu: 48
> > memory: 64G
> > 
> [snip]



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking
  2015-06-15 19:26 [Resend PATCH " Yuyang Du
@ 2015-06-15 19:26 ` Yuyang Du
  2015-06-19  6:00   ` Boqun Feng
  0 siblings, 1 reply; 13+ messages in thread
From: Yuyang Du @ 2015-06-15 19:26 UTC (permalink / raw)
  To: mingo, peterz, linux-kernel
  Cc: pjt, bsegall, morten.rasmussen, vincent.guittot, dietmar.eggemann,
	arjan.van.de.ven, len.brown, rafael.j.wysocki, fengguang.wu,
	boqun.feng, srikar, Yuyang Du

The idea of runnable load average (let runnable time contribute to weight)
was proposed by Paul Turner, and it is still followed by this rewrite. This
rewrite aims to solve the following issues:

1. cfs_rq's load average (namely runnable_load_avg and blocked_load_avg) is
   updated at the granularity of one entity at a time, which results in the
   cfs_rq's load average being stale or only partially updated: at any time,
   only one entity is up to date, while all other entities are effectively
   lagging behind. This is undesirable.

   To illustrate, if we have n runnable entities in the cfs_rq, as time
   elapses, they certainly become outdated:

   t0: cfs_rq { e1_old, e2_old, ..., en_old }

   and when we update:

   t1: update e1, then we have cfs_rq { e1_new, e2_old, ..., en_old }

   t2: update e2, then we have cfs_rq { e1_old, e2_new, ..., en_old }

   ...

   We solve this by combining all runnable entities' load averages together
   in cfs_rq's avg, and update the cfs_rq's avg as a whole. This is based
   on the fact that if we regard the update as a function, then:

   w * update(e) = update(w * e) and

   update(e1) + update(e2) = update(e1 + e2), then

   w1 * update(e1) + w2 * update(e2) = update(w1 * e1 + w2 * e2)

   therefore, by this rewrite, we have an entirely updated cfs_rq at the
   time we update it:

   t1: update cfs_rq { e1_new, e2_new, ..., en_new }

   t2: update cfs_rq { e1_new, e2_new, ..., en_new }

   ...

2. cfs_rq's load average differs between the top rq->cfs_rq and other
   task_groups' per CPU cfs_rqs in whether or not blocked_load_avg
   contributes to the load.

   The basic idea behind runnable load average (the same for utilization)
   is that the blocked state is taken into account as opposed to only
   accounting for the currently runnable state. Therefore, the average
   should include both the runnable/running and blocked load averages.
   This rewrite does that.

   In addition, we also combine runnable/running and blocked averages
   of all entities into the cfs_rq's average, and update it together at
   once. This is based on the fact that:

   update(runnable) + update(blocked) = update(runnable + blocked)

   This significantly reduces the code, as we no longer need to separately
   maintain/update the runnable/running load and the blocked load.

3. How task_group entities' share is calculated is complex.

   We reduce the complexity in this rewrite to allow a very simple rule:
   the task_group's load_avg is aggregated from its per CPU cfs_rqs'
   load_avgs. Then a group entity's weight is simply proportional to its
   own cfs_rq's load_avg / task_group's load_avg. To illustrate,

   if a task_group has { cfs_rq1, cfs_rq2, ..., cfs_rqn }, then,

   task_group_avg = cfs_rq1_avg + cfs_rq2_avg + ... + cfs_rqn_avg, then

   cfs_rqx's entity's share = cfs_rqx_avg / task_group_avg * task_group's share
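
   For instance (made-up numbers, just to make the rule concrete): if the
   task_group's share = 1024 and { cfs_rq1_avg = 300, cfs_rq2_avg = 100 },
   then task_group_avg = 400, so cfs_rq1's entity gets 300/400 * 1024 = 768
   and cfs_rq2's entity gets 100/400 * 1024 = 256.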

To sum up, this rewrite in principle is equivalent to the current one, but
fixes the issues described above. As it turns out, it significantly reduces
the code complexity and hence increases clarity and efficiency. In addition,
the new averages are smoother/more continuous (no spurious spikes and valleys)
and are updated more consistently and quickly to reflect the load dynamics. As
a result, we have less load tracking overhead, better performance, and
especially better power efficiency due to more balanced load.
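
As a sanity check on the constants retained below (LOAD_AVG_PERIOD,
LOAD_AVG_MAX, LOAD_AVG_MAX_N and the runnable_avg_yN_inv[] table), here is a
minimal userspace sketch -- not part of the patch -- that iterates the same
fixed-point decay (multiply by y in 32-bit fixed point, where y^32 = 1/2,
shift right by 32, then accrue one full 1024us period) until the sum
saturates:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* y^1 in 32-bit fixed point, i.e. runnable_avg_yN_inv[1] */
	const uint64_t inv_y = 0xfa83b2da;
	uint64_t sum = 0, prev = UINT64_MAX;
	int periods = 0;

	while (sum != prev) {
		prev = sum;
		/* decay the accumulated sum by one period, then add a full period */
		sum = ((sum * inv_y) >> 32) + 1024;
		periods++;
	}

	/* expect 47742 (LOAD_AVG_MAX), reached after ~345 periods (LOAD_AVG_MAX_N) */
	printf("saturated sum = %llu after %d periods\n",
	       (unsigned long long)sum, periods);
	return 0;
}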

Signed-off-by: Yuyang Du <yuyang.du@intel.com>
---
 include/linux/sched.h |   40 ++--
 kernel/sched/core.c   |    3 -
 kernel/sched/debug.c  |   35 +--
 kernel/sched/fair.c   |  625 ++++++++++++++++---------------------------------
 kernel/sched/sched.h  |   28 +--
 5 files changed, 240 insertions(+), 491 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index af0eeba..8b4bc4f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1183,29 +1183,23 @@ struct load_weight {
 	u32 inv_weight;
 };
 
+/*
+ * The load_avg/util_avg represents an infinite geometric series:
+ * 1) load_avg describes the amount of time that a sched_entity
+ * is runnable on a rq. It is based on both load_sum and the
+ * weight of the task.
+ * 2) util_avg describes the amount of time that a sched_entity
+ * is running on a CPU. It is based on util_sum and is scaled
+ * in the range [0..SCHED_LOAD_SCALE].
+ * The 64 bit load_sum can:
+ * 1) for cfs_rq, accommodate 4353082796 (=2^64/47742/88761) entities with
+ * the highest weight (=88761) always runnable, without overflowing
+ * 2) for an entity, support any load.weight always runnable
+ */
 struct sched_avg {
-	u64 last_runnable_update;
-	s64 decay_count;
-	/*
-	 * utilization_avg_contrib describes the amount of time that a
-	 * sched_entity is running on a CPU. It is based on running_avg_sum
-	 * and is scaled in the range [0..SCHED_LOAD_SCALE].
-	 * load_avg_contrib described the amount of time that a sched_entity
-	 * is runnable on a rq. It is based on both runnable_avg_sum and the
-	 * weight of the task.
-	 */
-	unsigned long load_avg_contrib, utilization_avg_contrib;
-	/*
-	 * These sums represent an infinite geometric series and so are bound
-	 * above by 1024/(1-y).  Thus we only need a u32 to store them for all
-	 * choices of y < 1-2^(-32)*1024.
-	 * running_avg_sum reflects the time that the sched_entity is
-	 * effectively running on the CPU.
-	 * runnable_avg_sum represents the amount of time a sched_entity is on
-	 * a runqueue which includes the running time that is monitored by
-	 * running_avg_sum.
-	 */
-	u32 runnable_avg_sum, avg_period, running_avg_sum;
+	u64 last_update_time, load_sum;
+	u32 util_sum, period_contrib;
+	unsigned long load_avg, util_avg;
 };
 
 #ifdef CONFIG_SCHEDSTATS
@@ -1271,7 +1265,7 @@ struct sched_entity {
 #endif
 
 #ifdef CONFIG_SMP
-	/* Per-entity load-tracking */
+	/* Per entity load average tracking */
 	struct sched_avg	avg;
 #endif
 };
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 921a754..724de5b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1828,9 +1828,6 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 	p->se.prev_sum_exec_runtime	= 0;
 	p->se.nr_migrations		= 0;
 	p->se.vruntime			= 0;
-#ifdef CONFIG_SMP
-	p->se.avg.decay_count		= 0;
-#endif
 	INIT_LIST_HEAD(&p->se.group_node);
 
 #ifdef CONFIG_SCHEDSTATS
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index ca39cb7..db3e875 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -88,12 +88,8 @@ static void print_cfs_group_stats(struct seq_file *m, int cpu, struct task_group
 #endif
 	P(se->load.weight);
 #ifdef CONFIG_SMP
-	P(se->avg.runnable_avg_sum);
-	P(se->avg.running_avg_sum);
-	P(se->avg.avg_period);
-	P(se->avg.load_avg_contrib);
-	P(se->avg.utilization_avg_contrib);
-	P(se->avg.decay_count);
+	P(se->avg.load_avg);
+	P(se->avg.util_avg);
 #endif
 #undef PN
 #undef P
@@ -207,21 +203,13 @@ void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
 	SEQ_printf(m, "  .%-30s: %d\n", "nr_running", cfs_rq->nr_running);
 	SEQ_printf(m, "  .%-30s: %ld\n", "load", cfs_rq->load.weight);
 #ifdef CONFIG_SMP
-	SEQ_printf(m, "  .%-30s: %ld\n", "runnable_load_avg",
-			cfs_rq->runnable_load_avg);
-	SEQ_printf(m, "  .%-30s: %ld\n", "blocked_load_avg",
-			cfs_rq->blocked_load_avg);
-	SEQ_printf(m, "  .%-30s: %ld\n", "utilization_load_avg",
-			cfs_rq->utilization_load_avg);
+	SEQ_printf(m, "  .%-30s: %lu\n", "load_avg",
+			cfs_rq->avg.load_avg);
+	SEQ_printf(m, "  .%-30s: %lu\n", "util_avg",
+			cfs_rq->avg.util_avg);
 #ifdef CONFIG_FAIR_GROUP_SCHED
-	SEQ_printf(m, "  .%-30s: %ld\n", "tg_load_contrib",
-			cfs_rq->tg_load_contrib);
-	SEQ_printf(m, "  .%-30s: %d\n", "tg_runnable_contrib",
-			cfs_rq->tg_runnable_contrib);
 	SEQ_printf(m, "  .%-30s: %ld\n", "tg_load_avg",
 			atomic_long_read(&cfs_rq->tg->load_avg));
-	SEQ_printf(m, "  .%-30s: %d\n", "tg->runnable_avg",
-			atomic_read(&cfs_rq->tg->runnable_avg));
 #endif
 #endif
 #ifdef CONFIG_CFS_BANDWIDTH
@@ -632,12 +620,11 @@ void proc_sched_show_task(struct task_struct *p, struct seq_file *m)
 
 	P(se.load.weight);
 #ifdef CONFIG_SMP
-	P(se.avg.runnable_avg_sum);
-	P(se.avg.running_avg_sum);
-	P(se.avg.avg_period);
-	P(se.avg.load_avg_contrib);
-	P(se.avg.utilization_avg_contrib);
-	P(se.avg.decay_count);
+	P(se.avg.load_sum);
+	P(se.avg.util_sum);
+	P(se.avg.load_avg);
+	P(se.avg.util_avg);
+	P(se.avg.last_update_time);
 #endif
 	P(policy);
 	P(prio);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 56c1b94..f336f6e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -283,9 +283,6 @@ static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
 	return grp->my_q;
 }
 
-static void update_cfs_rq_blocked_load(struct cfs_rq *cfs_rq,
-				       int force_update);
-
 static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
 	if (!cfs_rq->on_list) {
@@ -305,8 +302,6 @@ static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 		}
 
 		cfs_rq->on_list = 1;
-		/* We should have no load, but we need to update last_decay. */
-		update_cfs_rq_blocked_load(cfs_rq, 0);
 	}
 }
 
@@ -669,19 +664,31 @@ static u64 sched_vslice(struct cfs_rq *cfs_rq, struct sched_entity *se)
 static int select_idle_sibling(struct task_struct *p, int cpu);
 static unsigned long task_h_load(struct task_struct *p);
 
-static inline void __update_task_entity_contrib(struct sched_entity *se);
-static inline void __update_task_entity_utilization(struct sched_entity *se);
+/*
+ * We choose a half-life close to 1 scheduling period.
+ * Note: The tables below are dependent on this value.
+ */
+#define LOAD_AVG_PERIOD 32
+#define LOAD_AVG_MAX 47742 /* maximum possible load avg */
+#define LOAD_AVG_MAX_N 345 /* number of full periods to produce LOAD_AVG_MAX */
 
 /* Give new task start runnable values to heavy its load in infant time */
 void init_task_runnable_average(struct task_struct *p)
 {
-	u32 slice;
+	struct sched_avg *sa = &p->se.avg;
 
-	slice = sched_slice(task_cfs_rq(p), &p->se) >> 10;
-	p->se.avg.runnable_avg_sum = p->se.avg.running_avg_sum = slice;
-	p->se.avg.avg_period = slice;
-	__update_task_entity_contrib(&p->se);
-	__update_task_entity_utilization(&p->se);
+	sa->last_update_time = 0;
+	/*
+	 * sched_avg's period_contrib should be strictly less than 1024, so
+	 * we give it 1023 to make sure it is almost a full period (1024us), and
+	 * will definitely be updated (after the enqueue).
+	 */
+	sa->period_contrib = 1023;
+	sa->load_avg = scale_load_down(p->se.load.weight);
+	sa->load_sum = sa->load_avg * LOAD_AVG_MAX;
+	sa->util_avg = scale_load_down(SCHED_LOAD_SCALE);
+	sa->util_sum = sa->util_avg * LOAD_AVG_MAX;
+	/* when this task is enqueued, it will contribute to its cfs_rq's load_avg */
 }
 #else
 void init_task_runnable_average(struct task_struct *p)
@@ -1687,8 +1694,8 @@ static u64 numa_get_avg_runtime(struct task_struct *p, u64 *period)
 		delta = runtime - p->last_sum_exec_runtime;
 		*period = now - p->last_task_numa_placement;
 	} else {
-		delta = p->se.avg.runnable_avg_sum;
-		*period = p->se.avg.avg_period;
+		delta = p->se.avg.load_sum / p->se.load.weight;
+		*period = LOAD_AVG_MAX;
 	}
 
 	p->last_sum_exec_runtime = runtime;
@@ -2336,13 +2343,13 @@ static inline long calc_tg_weight(struct task_group *tg, struct cfs_rq *cfs_rq)
 	long tg_weight;
 
 	/*
-	 * Use this CPU's actual weight instead of the last load_contribution
-	 * to gain a more accurate current total weight. See
-	 * update_cfs_rq_load_contribution().
+	 * Use this CPU's real-time load instead of the last load contribution,
+	 * as the updating of the contribution is delayed, and we will use
+	 * the real-time load to calculate the shares. See update_tg_load_avg().
 	 */
 	tg_weight = atomic_long_read(&tg->load_avg);
-	tg_weight -= cfs_rq->tg_load_contrib;
-	tg_weight += cfs_rq->load.weight;
+	tg_weight -= cfs_rq->tg_load_avg_contrib;
+	tg_weight += cfs_rq->avg.load_avg;
 
 	return tg_weight;
 }
@@ -2352,7 +2359,7 @@ static long calc_cfs_shares(struct cfs_rq *cfs_rq, struct task_group *tg)
 	long tg_weight, load, shares;
 
 	tg_weight = calc_tg_weight(tg, cfs_rq);
-	load = cfs_rq->load.weight;
+	load = cfs_rq->avg.load_avg;
 
 	shares = (tg->shares * load);
 	if (tg_weight)
@@ -2414,14 +2421,6 @@ static inline void update_cfs_shares(struct cfs_rq *cfs_rq)
 #endif /* CONFIG_FAIR_GROUP_SCHED */
 
 #ifdef CONFIG_SMP
-/*
- * We choose a half-life close to 1 scheduling period.
- * Note: The tables below are dependent on this value.
- */
-#define LOAD_AVG_PERIOD 32
-#define LOAD_AVG_MAX 47742 /* maximum possible load avg */
-#define LOAD_AVG_MAX_N 345 /* number of full periods to produce LOAD_MAX_AVG */
-
 /* Precomputed fixed inverse multiplies for multiplication by y^n */
 static const u32 runnable_avg_yN_inv[] = {
 	0xffffffff, 0xfa83b2da, 0xf5257d14, 0xefe4b99a, 0xeac0c6e6, 0xe5b906e6,
@@ -2470,9 +2469,8 @@ static __always_inline u64 decay_load(u64 val, u64 n)
 		local_n %= LOAD_AVG_PERIOD;
 	}
 
-	val *= runnable_avg_yN_inv[local_n];
-	/* We don't use SRR here since we always want to round down. */
-	return val >> 32;
+	val = mul_u64_u32_shr(val, runnable_avg_yN_inv[local_n], 32);
+	return val;
 }
 
 /*
@@ -2531,23 +2529,23 @@ static u32 __compute_runnable_contrib(u64 n)
  *   load_avg = u_0` + y*(u_0 + u_1*y + u_2*y^2 + ... )
  *            = u_0 + u_1*y + u_2*y^2 + ... [re-labeling u_i --> u_{i+1}]
  */
-static __always_inline int __update_entity_runnable_avg(u64 now, int cpu,
+static __always_inline int __update_load_avg(u64 now, int cpu,
 							struct sched_avg *sa,
-							int runnable,
+							unsigned long weight,
 							int running)
 {
 	u64 delta, periods;
-	u32 runnable_contrib;
+	u32 contrib;
 	int delta_w, decayed = 0;
 	unsigned long scale_freq = arch_scale_freq_capacity(NULL, cpu);
 
-	delta = now - sa->last_runnable_update;
+	delta = now - sa->last_update_time;
 	/*
 	 * This should only happen when time goes backwards, which it
 	 * unfortunately does during sched clock init when we swap over to TSC.
 	 */
 	if ((s64)delta < 0) {
-		sa->last_runnable_update = now;
+		sa->last_update_time = now;
 		return 0;
 	}
 
@@ -2558,26 +2556,26 @@ static __always_inline int __update_entity_runnable_avg(u64 now, int cpu,
 	delta >>= 10;
 	if (!delta)
 		return 0;
-	sa->last_runnable_update = now;
+	sa->last_update_time = now;
 
 	/* delta_w is the amount already accumulated against our next period */
-	delta_w = sa->avg_period % 1024;
+	delta_w = sa->period_contrib;
 	if (delta + delta_w >= 1024) {
-		/* period roll-over */
 		decayed = 1;
 
+		/* what is left for the next period will start over; we don't know it yet */
+		sa->period_contrib = 0;
+
 		/*
 		 * Now that we know we're crossing a period boundary, figure
 		 * out how much from delta we need to complete the current
 		 * period and accrue it.
 		 */
 		delta_w = 1024 - delta_w;
-		if (runnable)
-			sa->runnable_avg_sum += delta_w;
+		if (weight)
+			sa->load_sum += weight * delta_w;
 		if (running)
-			sa->running_avg_sum += delta_w * scale_freq
-				>> SCHED_CAPACITY_SHIFT;
-		sa->avg_period += delta_w;
+			sa->util_sum += delta_w * scale_freq >> SCHED_CAPACITY_SHIFT;
 
 		delta -= delta_w;
 
@@ -2585,334 +2583,156 @@ static __always_inline int __update_entity_runnable_avg(u64 now, int cpu,
 		periods = delta / 1024;
 		delta %= 1024;
 
-		sa->runnable_avg_sum = decay_load(sa->runnable_avg_sum,
-						  periods + 1);
-		sa->running_avg_sum = decay_load(sa->running_avg_sum,
-						  periods + 1);
-		sa->avg_period = decay_load(sa->avg_period,
-						     periods + 1);
+		sa->load_sum = decay_load(sa->load_sum, periods + 1);
+		sa->util_sum = decay_load((u64)(sa->util_sum), periods + 1);
 
 		/* Efficiently calculate \sum (1..n_period) 1024*y^i */
-		runnable_contrib = __compute_runnable_contrib(periods);
-		if (runnable)
-			sa->runnable_avg_sum += runnable_contrib;
+		contrib = __compute_runnable_contrib(periods);
+		if (weight)
+			sa->load_sum += weight * contrib;
 		if (running)
-			sa->running_avg_sum += runnable_contrib * scale_freq
-				>> SCHED_CAPACITY_SHIFT;
-		sa->avg_period += runnable_contrib;
+			sa->util_sum += contrib * scale_freq >> SCHED_CAPACITY_SHIFT;
 	}
 
 	/* Remainder of delta accrued against u_0` */
-	if (runnable)
-		sa->runnable_avg_sum += delta;
+	if (weight)
+		sa->load_sum += weight * delta;
 	if (running)
-		sa->running_avg_sum += delta * scale_freq
-			>> SCHED_CAPACITY_SHIFT;
-	sa->avg_period += delta;
-
-	return decayed;
-}
-
-/* Synchronize an entity's decay with its parenting cfs_rq.*/
-static inline u64 __synchronize_entity_decay(struct sched_entity *se)
-{
-	struct cfs_rq *cfs_rq = cfs_rq_of(se);
-	u64 decays = atomic64_read(&cfs_rq->decay_counter);
+		sa->util_sum += delta * scale_freq >> SCHED_CAPACITY_SHIFT;
 
-	decays -= se->avg.decay_count;
-	se->avg.decay_count = 0;
-	if (!decays)
-		return 0;
+	sa->period_contrib += delta;
 
-	se->avg.load_avg_contrib = decay_load(se->avg.load_avg_contrib, decays);
-	se->avg.utilization_avg_contrib =
-		decay_load(se->avg.utilization_avg_contrib, decays);
+	if (decayed) {
+		sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX);
+		sa->util_avg = (sa->util_sum << SCHED_LOAD_SHIFT) / LOAD_AVG_MAX;
+	}
 
-	return decays;
+	return decayed;
 }
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-static inline void __update_cfs_rq_tg_load_contrib(struct cfs_rq *cfs_rq,
-						 int force_update)
-{
-	struct task_group *tg = cfs_rq->tg;
-	long tg_contrib;
-
-	tg_contrib = cfs_rq->runnable_load_avg + cfs_rq->blocked_load_avg;
-	tg_contrib -= cfs_rq->tg_load_contrib;
-
-	if (!tg_contrib)
-		return;
-
-	if (force_update || abs(tg_contrib) > cfs_rq->tg_load_contrib / 8) {
-		atomic_long_add(tg_contrib, &tg->load_avg);
-		cfs_rq->tg_load_contrib += tg_contrib;
-	}
-}
-
 /*
- * Aggregate cfs_rq runnable averages into an equivalent task_group
- * representation for computing load contributions.
+ * Updating tg's load_avg is necessary before update_cfs_shares() (which is done)
+ * and effective_load (which is not done because it is too costly).
  */
-static inline void __update_tg_runnable_avg(struct sched_avg *sa,
-						  struct cfs_rq *cfs_rq)
+static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force)
 {
-	struct task_group *tg = cfs_rq->tg;
-	long contrib;
-
-	/* The fraction of a cpu used by this cfs_rq */
-	contrib = div_u64((u64)sa->runnable_avg_sum << NICE_0_SHIFT,
-			  sa->avg_period + 1);
-	contrib -= cfs_rq->tg_runnable_contrib;
+	long delta = cfs_rq->avg.load_avg - cfs_rq->tg_load_avg_contrib;
 
-	if (abs(contrib) > cfs_rq->tg_runnable_contrib / 64) {
-		atomic_add(contrib, &tg->runnable_avg);
-		cfs_rq->tg_runnable_contrib += contrib;
-	}
-}
-
-static inline void __update_group_entity_contrib(struct sched_entity *se)
-{
-	struct cfs_rq *cfs_rq = group_cfs_rq(se);
-	struct task_group *tg = cfs_rq->tg;
-	int runnable_avg;
-
-	u64 contrib;
-
-	contrib = cfs_rq->tg_load_contrib * tg->shares;
-	se->avg.load_avg_contrib = div_u64(contrib,
-				     atomic_long_read(&tg->load_avg) + 1);
-
-	/*
-	 * For group entities we need to compute a correction term in the case
-	 * that they are consuming <1 cpu so that we would contribute the same
-	 * load as a task of equal weight.
-	 *
-	 * Explicitly co-ordinating this measurement would be expensive, but
-	 * fortunately the sum of each cpus contribution forms a usable
-	 * lower-bound on the true value.
-	 *
-	 * Consider the aggregate of 2 contributions.  Either they are disjoint
-	 * (and the sum represents true value) or they are disjoint and we are
-	 * understating by the aggregate of their overlap.
-	 *
-	 * Extending this to N cpus, for a given overlap, the maximum amount we
-	 * understand is then n_i(n_i+1)/2 * w_i where n_i is the number of
-	 * cpus that overlap for this interval and w_i is the interval width.
-	 *
-	 * On a small machine; the first term is well-bounded which bounds the
-	 * total error since w_i is a subset of the period.  Whereas on a
-	 * larger machine, while this first term can be larger, if w_i is the
-	 * of consequential size guaranteed to see n_i*w_i quickly converge to
-	 * our upper bound of 1-cpu.
-	 */
-	runnable_avg = atomic_read(&tg->runnable_avg);
-	if (runnable_avg < NICE_0_LOAD) {
-		se->avg.load_avg_contrib *= runnable_avg;
-		se->avg.load_avg_contrib >>= NICE_0_SHIFT;
+	if (force || abs(delta) > cfs_rq->tg_load_avg_contrib / 64) {
+		atomic_long_add(delta, &cfs_rq->tg->load_avg);
+		cfs_rq->tg_load_avg_contrib = cfs_rq->avg.load_avg;
 	}
 }
 
 #else /* CONFIG_FAIR_GROUP_SCHED */
-static inline void __update_cfs_rq_tg_load_contrib(struct cfs_rq *cfs_rq,
-						 int force_update) {}
-static inline void __update_tg_runnable_avg(struct sched_avg *sa,
-						  struct cfs_rq *cfs_rq) {}
-static inline void __update_group_entity_contrib(struct sched_entity *se) {}
+static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}
 #endif /* CONFIG_FAIR_GROUP_SCHED */
 
-static inline void __update_task_entity_contrib(struct sched_entity *se)
-{
-	u32 contrib;
-
-	/* avoid overflowing a 32-bit type w/ SCHED_LOAD_SCALE */
-	contrib = se->avg.runnable_avg_sum * scale_load_down(se->load.weight);
-	contrib /= (se->avg.avg_period + 1);
-	se->avg.load_avg_contrib = scale_load(contrib);
-}
+static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq);
 
-/* Compute the current contribution to load_avg by se, return any delta */
-static long __update_entity_load_avg_contrib(struct sched_entity *se)
+/* Group cfs_rq's load_avg is used for task_h_load and update_cfs_shares() */
+static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
 {
-	long old_contrib = se->avg.load_avg_contrib;
+	int decayed;
 
-	if (entity_is_task(se)) {
-		__update_task_entity_contrib(se);
-	} else {
-		__update_tg_runnable_avg(&se->avg, group_cfs_rq(se));
-		__update_group_entity_contrib(se);
+	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
+		long r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
+		cfs_rq->avg.load_avg = max_t(long, cfs_rq->avg.load_avg - r, 0);
+		cfs_rq->avg.load_sum =
+			max_t(s64, cfs_rq->avg.load_sum - r * LOAD_AVG_MAX, 0);
 	}
 
-	return se->avg.load_avg_contrib - old_contrib;
-}
-
-
-static inline void __update_task_entity_utilization(struct sched_entity *se)
-{
-	u32 contrib;
-
-	/* avoid overflowing a 32-bit type w/ SCHED_LOAD_SCALE */
-	contrib = se->avg.running_avg_sum * scale_load_down(SCHED_LOAD_SCALE);
-	contrib /= (se->avg.avg_period + 1);
-	se->avg.utilization_avg_contrib = scale_load(contrib);
-}
+	if (atomic_long_read(&cfs_rq->removed_util_avg)) {
+		long r = atomic_long_xchg(&cfs_rq->removed_util_avg, 0);
+		cfs_rq->avg.util_avg = max_t(long, cfs_rq->avg.util_avg - r, 0);
+		cfs_rq->avg.util_sum =
+			max_t(s32, cfs_rq->avg.util_sum - r * LOAD_AVG_MAX, 0);
+	}
 
-static long __update_entity_utilization_avg_contrib(struct sched_entity *se)
-{
-	long old_contrib = se->avg.utilization_avg_contrib;
+	decayed = __update_load_avg(now, cpu_of(rq_of(cfs_rq)), &cfs_rq->avg,
+		scale_load_down(cfs_rq->load.weight), cfs_rq->curr != NULL);
 
-	if (entity_is_task(se))
-		__update_task_entity_utilization(se);
-	else
-		se->avg.utilization_avg_contrib =
-					group_cfs_rq(se)->utilization_load_avg;
-
-	return se->avg.utilization_avg_contrib - old_contrib;
-}
+#ifndef CONFIG_64BIT
+	smp_wmb();
+	cfs_rq->load_last_update_time_copy = cfs_rq->avg.last_update_time;
+#endif
 
-static inline void subtract_blocked_load_contrib(struct cfs_rq *cfs_rq,
-						 long load_contrib)
-{
-	if (likely(load_contrib < cfs_rq->blocked_load_avg))
-		cfs_rq->blocked_load_avg -= load_contrib;
-	else
-		cfs_rq->blocked_load_avg = 0;
+	return decayed;
 }
 
-static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq);
-
-/* Update a sched_entity's runnable average */
-static inline void update_entity_load_avg(struct sched_entity *se,
-					  int update_cfs_rq)
+/* Update task and its cfs_rq load average */
+static inline void update_load_avg(struct sched_entity *se, int update_tg)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
-	long contrib_delta, utilization_delta;
 	int cpu = cpu_of(rq_of(cfs_rq));
-	u64 now;
+	u64 now = cfs_rq_clock_task(cfs_rq);
 
 	/*
-	 * For a group entity we need to use their owned cfs_rq_clock_task() in
-	 * case they are the parent of a throttled hierarchy.
+	 * Track task load average for carrying it to the new CPU after migration,
+	 * and track group sched_entity load average for task_h_load calc in migration
 	 */
-	if (entity_is_task(se))
-		now = cfs_rq_clock_task(cfs_rq);
-	else
-		now = cfs_rq_clock_task(group_cfs_rq(se));
+	__update_load_avg(now, cpu, &se->avg,
+		se->on_rq * scale_load_down(se->load.weight), cfs_rq->curr == se);
 
-	if (!__update_entity_runnable_avg(now, cpu, &se->avg, se->on_rq,
-					cfs_rq->curr == se))
-		return;
-
-	contrib_delta = __update_entity_load_avg_contrib(se);
-	utilization_delta = __update_entity_utilization_avg_contrib(se);
-
-	if (!update_cfs_rq)
-		return;
-
-	if (se->on_rq) {
-		cfs_rq->runnable_load_avg += contrib_delta;
-		cfs_rq->utilization_load_avg += utilization_delta;
-	} else {
-		subtract_blocked_load_contrib(cfs_rq, -contrib_delta);
-	}
+	if (update_cfs_rq_load_avg(now, cfs_rq) && update_tg)
+		update_tg_load_avg(cfs_rq, 0);
 }
 
-/*
- * Decay the load contributed by all blocked children and account this so that
- * their contribution may appropriately discounted when they wake up.
- */
-static void update_cfs_rq_blocked_load(struct cfs_rq *cfs_rq, int force_update)
+/* Add the load generated by se into cfs_rq's load average */
+static inline void
+enqueue_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	u64 now = cfs_rq_clock_task(cfs_rq) >> 20;
-	u64 decays;
-
-	decays = now - cfs_rq->last_decay;
-	if (!decays && !force_update)
-		return;
+	struct sched_avg *sa = &se->avg;
+	u64 now = cfs_rq_clock_task(cfs_rq);
+	int migrated = 0, decayed;
 
-	if (atomic_long_read(&cfs_rq->removed_load)) {
-		unsigned long removed_load;
-		removed_load = atomic_long_xchg(&cfs_rq->removed_load, 0);
-		subtract_blocked_load_contrib(cfs_rq, removed_load);
+	if (sa->last_update_time == 0) {
+		sa->last_update_time = now;
+		migrated = 1;
 	}
-
-	if (decays) {
-		cfs_rq->blocked_load_avg = decay_load(cfs_rq->blocked_load_avg,
-						      decays);
-		atomic64_add(decays, &cfs_rq->decay_counter);
-		cfs_rq->last_decay = now;
+	else {
+		__update_load_avg(now, cpu_of(rq_of(cfs_rq)), sa,
+			se->on_rq * scale_load_down(se->load.weight), cfs_rq->curr == se);
 	}
 
-	__update_cfs_rq_tg_load_contrib(cfs_rq, force_update);
-}
+	decayed = update_cfs_rq_load_avg(now, cfs_rq);
 
-/* Add the load generated by se into cfs_rq's child load-average */
-static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
-						  struct sched_entity *se,
-						  int wakeup)
-{
-	/*
-	 * We track migrations using entity decay_count <= 0, on a wake-up
-	 * migration we use a negative decay count to track the remote decays
-	 * accumulated while sleeping.
-	 *
-	 * Newly forked tasks are enqueued with se->avg.decay_count == 0, they
-	 * are seen by enqueue_entity_load_avg() as a migration with an already
-	 * constructed load_avg_contrib.
-	 */
-	if (unlikely(se->avg.decay_count <= 0)) {
-		se->avg.last_runnable_update = rq_clock_task(rq_of(cfs_rq));
-		if (se->avg.decay_count) {
-			/*
-			 * In a wake-up migration we have to approximate the
-			 * time sleeping.  This is because we can't synchronize
-			 * clock_task between the two cpus, and it is not
-			 * guaranteed to be read-safe.  Instead, we can
-			 * approximate this using our carried decays, which are
-			 * explicitly atomically readable.
-			 */
-			se->avg.last_runnable_update -= (-se->avg.decay_count)
-							<< 20;
-			update_entity_load_avg(se, 0);
-			/* Indicate that we're now synchronized and on-rq */
-			se->avg.decay_count = 0;
-		}
-		wakeup = 0;
-	} else {
-		__synchronize_entity_decay(se);
+	if (migrated) {
+		cfs_rq->avg.load_avg += sa->load_avg;
+		cfs_rq->avg.load_sum += sa->load_sum;
+		cfs_rq->avg.util_avg += sa->util_avg;
+		cfs_rq->avg.util_sum += sa->util_sum;
 	}
 
-	/* migrated tasks did not contribute to our blocked load */
-	if (wakeup) {
-		subtract_blocked_load_contrib(cfs_rq, se->avg.load_avg_contrib);
-		update_entity_load_avg(se, 0);
-	}
-
-	cfs_rq->runnable_load_avg += se->avg.load_avg_contrib;
-	cfs_rq->utilization_load_avg += se->avg.utilization_avg_contrib;
-	/* we force update consideration on load-balancer moves */
-	update_cfs_rq_blocked_load(cfs_rq, !wakeup);
+	if (decayed || migrated)
+		update_tg_load_avg(cfs_rq, 0);
 }
 
 /*
- * Remove se's load from this cfs_rq child load-average, if the entity is
- * transitioning to a blocked state we track its projected decay using
- * blocked_load_avg.
+ * Task first catches up with its cfs_rq, and then subtracts
+ * itself from the cfs_rq (the task must be off the queue now).
  */
-static inline void dequeue_entity_load_avg(struct cfs_rq *cfs_rq,
-						  struct sched_entity *se,
-						  int sleep)
+void remove_entity_load_avg(struct sched_entity *se)
 {
-	update_entity_load_avg(se, 1);
-	/* we force update consideration on load-balancer moves */
-	update_cfs_rq_blocked_load(cfs_rq, !sleep);
+	struct cfs_rq *cfs_rq = cfs_rq_of(se);
+	u64 last_update_time;
+
+#ifndef CONFIG_64BIT
+	u64 last_update_time_copy;
 
-	cfs_rq->runnable_load_avg -= se->avg.load_avg_contrib;
-	cfs_rq->utilization_load_avg -= se->avg.utilization_avg_contrib;
-	if (sleep) {
-		cfs_rq->blocked_load_avg += se->avg.load_avg_contrib;
-		se->avg.decay_count = atomic64_read(&cfs_rq->decay_counter);
-	} /* migrations, e.g. sleep=0 leave decay_count == 0 */
+	do {
+		last_update_time_copy = cfs_rq->load_last_update_time_copy;
+		smp_rmb();
+		last_update_time = cfs_rq->avg.last_update_time;
+	} while (last_update_time != last_update_time_copy);
+#else
+	last_update_time = cfs_rq->avg.last_update_time;
+#endif
+
+	__update_load_avg(last_update_time, cpu_of(rq_of(cfs_rq)), &se->avg, 0, 0);
+	atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);
+	atomic_long_add(se->avg.util_avg, &cfs_rq->removed_util_avg);
 }
 
 /*
@@ -2937,16 +2757,10 @@ static int idle_balance(struct rq *this_rq);
 
 #else /* CONFIG_SMP */
 
-static inline void update_entity_load_avg(struct sched_entity *se,
-					  int update_cfs_rq) {}
-static inline void enqueue_entity_load_avg(struct cfs_rq *cfs_rq,
-					   struct sched_entity *se,
-					   int wakeup) {}
-static inline void dequeue_entity_load_avg(struct cfs_rq *cfs_rq,
-					   struct sched_entity *se,
-					   int sleep) {}
-static inline void update_cfs_rq_blocked_load(struct cfs_rq *cfs_rq,
-					      int force_update) {}
+static inline void update_load_avg(struct sched_entity *se, int update_tg) {}
+static inline void
+enqueue_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {}
+static inline void remove_entity_load_avg(struct sched_entity *se) {}
 
 static inline int idle_balance(struct rq *rq)
 {
@@ -3078,7 +2892,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 * Update run-time statistics of the 'current'.
 	 */
 	update_curr(cfs_rq);
-	enqueue_entity_load_avg(cfs_rq, se, flags & ENQUEUE_WAKEUP);
+	enqueue_entity_load_avg(cfs_rq, se);
 	account_entity_enqueue(cfs_rq, se);
 	update_cfs_shares(cfs_rq);
 
@@ -3153,7 +2967,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 * Update run-time statistics of the 'current'.
 	 */
 	update_curr(cfs_rq);
-	dequeue_entity_load_avg(cfs_rq, se, flags & DEQUEUE_SLEEP);
+	update_load_avg(se, 1);
 
 	update_stats_dequeue(cfs_rq, se);
 	if (flags & DEQUEUE_SLEEP) {
@@ -3243,7 +3057,7 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		 */
 		update_stats_wait_end(cfs_rq, se);
 		__dequeue_entity(cfs_rq, se);
-		update_entity_load_avg(se, 1);
+		update_load_avg(se, 1);
 	}
 
 	update_stats_curr_start(cfs_rq, se);
@@ -3343,7 +3157,7 @@ static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
 		/* Put 'current' back into the tree. */
 		__enqueue_entity(cfs_rq, prev);
 		/* in !on_rq case, update occurred at dequeue */
-		update_entity_load_avg(prev, 1);
+		update_load_avg(prev, 0);
 	}
 	cfs_rq->curr = NULL;
 }
@@ -3359,8 +3173,7 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
 	/*
 	 * Ensure that runnable average is periodically updated.
 	 */
-	update_entity_load_avg(curr, 1);
-	update_cfs_rq_blocked_load(cfs_rq, 1);
+	update_load_avg(curr, 1);
 	update_cfs_shares(cfs_rq);
 
 #ifdef CONFIG_SCHED_HRTICK
@@ -4233,8 +4046,8 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 		if (cfs_rq_throttled(cfs_rq))
 			break;
 
+		update_load_avg(se, 1);
 		update_cfs_shares(cfs_rq);
-		update_entity_load_avg(se, 1);
 	}
 
 	if (!se)
@@ -4293,8 +4106,8 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 		if (cfs_rq_throttled(cfs_rq))
 			break;
 
+		update_load_avg(se, 1);
 		update_cfs_shares(cfs_rq);
-		update_entity_load_avg(se, 1);
 	}
 
 	if (!se)
@@ -4433,7 +4246,7 @@ static void __update_cpu_load(struct rq *this_rq, unsigned long this_load,
 static void update_idle_cpu_load(struct rq *this_rq)
 {
 	unsigned long curr_jiffies = READ_ONCE(jiffies);
-	unsigned long load = this_rq->cfs.runnable_load_avg;
+	unsigned long load = this_rq->cfs.avg.load_avg;
 	unsigned long pending_updates;
 
 	/*
@@ -4479,7 +4292,7 @@ void update_cpu_load_nohz(void)
  */
 void update_cpu_load_active(struct rq *this_rq)
 {
-	unsigned long load = this_rq->cfs.runnable_load_avg;
+	unsigned long load = this_rq->cfs.avg.load_avg;
 	/*
 	 * See the mess around update_idle_cpu_load() / update_cpu_load_nohz().
 	 */
@@ -4490,7 +4303,7 @@ void update_cpu_load_active(struct rq *this_rq)
 /* Used instead of source_load when we know the type == 0 */
 static unsigned long weighted_cpuload(const int cpu)
 {
-	return cpu_rq(cpu)->cfs.runnable_load_avg;
+	return cpu_rq(cpu)->cfs.avg.load_avg;
 }
 
 /*
@@ -4540,7 +4353,7 @@ static unsigned long cpu_avg_load_per_task(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	unsigned long nr_running = READ_ONCE(rq->cfs.h_nr_running);
-	unsigned long load_avg = rq->cfs.runnable_load_avg;
+	unsigned long load_avg = rq->cfs.avg.load_avg;
 
 	if (nr_running)
 		return load_avg / nr_running;
@@ -4659,7 +4472,7 @@ static long effective_load(struct task_group *tg, int cpu, long wl, long wg)
 		/*
 		 * w = rw_i + @wl
 		 */
-		w = se->my_q->load.weight + wl;
+		w = se->my_q->avg.load_avg + wl;
 
 		/*
 		 * wl = S * s'_i; see (2)
@@ -4680,7 +4493,7 @@ static long effective_load(struct task_group *tg, int cpu, long wl, long wg)
 		/*
 		 * wl = dw_i = S * (s'_i - s_i); see (3)
 		 */
-		wl -= se->load.weight;
+		wl -= se->avg.load_avg;
 
 		/*
 		 * Recursively apply this logic to all parent groups to compute
@@ -4754,14 +4567,14 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
 	 */
 	if (sync) {
 		tg = task_group(current);
-		weight = current->se.load.weight;
+		weight = current->se.avg.load_avg;
 
 		this_load += effective_load(tg, this_cpu, -weight, -weight);
 		load += effective_load(tg, prev_cpu, 0, -weight);
 	}
 
 	tg = task_group(p);
-	weight = p->se.load.weight;
+	weight = p->se.avg.load_avg;
 
 	/*
 	 * In low-load situations, where prev_cpu is idle and this_cpu is idle
@@ -4954,12 +4767,12 @@ done:
  * tasks. The unit of the return value must be the one of capacity so we can
  * compare the usage with the capacity of the CPU that is available for CFS
  * task (ie cpu_capacity).
- * cfs.utilization_load_avg is the sum of running time of runnable tasks on a
+ * cfs.avg.util_avg is the sum of running time of runnable tasks on a
  * CPU. It represents the amount of utilization of a CPU in the range
  * [0..SCHED_LOAD_SCALE].  The usage of a CPU can't be higher than the full
  * capacity of the CPU because it's about the running time on this CPU.
- * Nevertheless, cfs.utilization_load_avg can be higher than SCHED_LOAD_SCALE
- * because of unfortunate rounding in avg_period and running_load_avg or just
+ * Nevertheless, cfs.avg.util_avg can be higher than SCHED_LOAD_SCALE
+ * because of unfortunate rounding in util_avg or just
  * after migrating tasks until the average stabilizes with the new running
  * time. So we need to check that the usage stays into the range
  * [0..cpu_capacity_orig] and cap if necessary.
@@ -4968,7 +4781,7 @@ done:
  */
 static int get_cpu_usage(int cpu)
 {
-	unsigned long usage = cpu_rq(cpu)->cfs.utilization_load_avg;
+	unsigned long usage = cpu_rq(cpu)->cfs.avg.util_avg;
 	unsigned long capacity = capacity_orig_of(cpu);
 
 	if (usage >= SCHED_LOAD_SCALE)
@@ -5074,26 +4887,22 @@ unlock:
  * previous cpu.  However, the caller only guarantees p->pi_lock is held; no
  * other assumptions, including the state of rq->lock, should be made.
  */
-static void
-migrate_task_rq_fair(struct task_struct *p, int next_cpu)
+static void migrate_task_rq_fair(struct task_struct *p, int next_cpu)
 {
-	struct sched_entity *se = &p->se;
-	struct cfs_rq *cfs_rq = cfs_rq_of(se);
-
 	/*
-	 * Load tracking: accumulate removed load so that it can be processed
-	 * when we next update owning cfs_rq under rq->lock.  Tasks contribute
-	 * to blocked load iff they have a positive decay-count.  It can never
-	 * be negative here since on-rq tasks have decay-count == 0.
+	 * We are supposed to update the task to "current" time, so it is up to date
+	 * and ready to go to the new CPU/cfs_rq. But we have difficulty in getting
+	 * what the current time is, so simply throw away the out-of-date time. This
+	 * will result in the wakee task being less decayed, but giving the wakee
+	 * more load does not sound bad.
 	 */
-	if (se->avg.decay_count) {
-		se->avg.decay_count = -__synchronize_entity_decay(se);
-		atomic_long_add(se->avg.load_avg_contrib,
-						&cfs_rq->removed_load);
-	}
+	remove_entity_load_avg(&p->se);
+
+	/* Tell new CPU we are migrated */
+	p->se.avg.last_update_time = 0;
 
 	/* We have migrated, no longer consider this task hot */
-	se->exec_start = 0;
+	p->se.exec_start = 0;
 }
 #endif /* CONFIG_SMP */
 
@@ -5977,36 +5786,6 @@ static void attach_tasks(struct lb_env *env)
 }
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-/*
- * update tg->load_weight by folding this cpu's load_avg
- */
-static void __update_blocked_averages_cpu(struct task_group *tg, int cpu)
-{
-	struct sched_entity *se = tg->se[cpu];
-	struct cfs_rq *cfs_rq = tg->cfs_rq[cpu];
-
-	/* throttled entities do not contribute to load */
-	if (throttled_hierarchy(cfs_rq))
-		return;
-
-	update_cfs_rq_blocked_load(cfs_rq, 1);
-
-	if (se) {
-		update_entity_load_avg(se, 1);
-		/*
-		 * We pivot on our runnable average having decayed to zero for
-		 * list removal.  This generally implies that all our children
-		 * have also been removed (modulo rounding error or bandwidth
-		 * control); however, such cases are rare and we can fix these
-		 * at enqueue.
-		 *
-		 * TODO: fix up out-of-order children on enqueue.
-		 */
-		if (!se->avg.runnable_avg_sum && !cfs_rq->nr_running)
-			list_del_leaf_cfs_rq(cfs_rq);
-	}
-}
-
 static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
@@ -6015,17 +5794,17 @@ static void update_blocked_averages(int cpu)
 
 	raw_spin_lock_irqsave(&rq->lock, flags);
 	update_rq_clock(rq);
+
 	/*
 	 * Iterates the task_group tree in a bottom up fashion, see
 	 * list_add_leaf_cfs_rq() for details.
 	 */
 	for_each_leaf_cfs_rq(rq, cfs_rq) {
-		/*
-		 * Note: We may want to consider periodically releasing
-		 * rq->lock about these updates so that creating many task
-		 * groups does not result in continually extending hold time.
-		 */
-		__update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
+		/* throttled entities do not contribute to load */
+		if (throttled_hierarchy(cfs_rq))
+			continue;
+
+		update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
 	}
 
 	raw_spin_unlock_irqrestore(&rq->lock, flags);
@@ -6055,14 +5834,13 @@ static void update_cfs_rq_h_load(struct cfs_rq *cfs_rq)
 	}
 
 	if (!se) {
-		cfs_rq->h_load = cfs_rq->runnable_load_avg;
+		cfs_rq->h_load = cfs_rq->avg.load_avg;
 		cfs_rq->last_h_load_update = now;
 	}
 
 	while ((se = cfs_rq->h_load_next) != NULL) {
 		load = cfs_rq->h_load;
-		load = div64_ul(load * se->avg.load_avg_contrib,
-				cfs_rq->runnable_load_avg + 1);
+		load = div64_ul(load * se->avg.load_avg, cfs_rq->avg.load_avg + 1);
 		cfs_rq = group_cfs_rq(se);
 		cfs_rq->h_load = load;
 		cfs_rq->last_h_load_update = now;
@@ -6074,8 +5852,8 @@ static unsigned long task_h_load(struct task_struct *p)
 	struct cfs_rq *cfs_rq = task_cfs_rq(p);
 
 	update_cfs_rq_h_load(cfs_rq);
-	return div64_ul(p->se.avg.load_avg_contrib * cfs_rq->h_load,
-			cfs_rq->runnable_load_avg + 1);
+	return div64_ul(p->se.avg.load_avg * cfs_rq->h_load,
+			cfs_rq->avg.load_avg + 1);
 }
 #else
 static inline void update_blocked_averages(int cpu)
@@ -6084,7 +5862,7 @@ static inline void update_blocked_averages(int cpu)
 
 static unsigned long task_h_load(struct task_struct *p)
 {
-	return p->se.avg.load_avg_contrib;
+	return p->se.avg.load_avg;
 }
 #endif
 
@@ -8085,15 +7863,18 @@ static void switched_from_fair(struct rq *rq, struct task_struct *p)
 	}
 
 #ifdef CONFIG_SMP
-	/*
-	* Remove our load from contribution when we leave sched_fair
-	* and ensure we don't carry in an old decay_count if we
-	* switch back.
-	*/
-	if (se->avg.decay_count) {
-		__synchronize_entity_decay(se);
-		subtract_blocked_load_contrib(cfs_rq, se->avg.load_avg_contrib);
-	}
+	/* Catch up with the cfs_rq and remove our load when we leave */
+	__update_load_avg(cfs_rq->avg.last_update_time, cpu_of(rq), &se->avg,
+		se->on_rq * scale_load_down(se->load.weight), cfs_rq->curr == se);
+
+	cfs_rq->avg.load_avg =
+		max_t(long, cfs_rq->avg.load_avg - se->avg.load_avg, 0);
+	cfs_rq->avg.load_sum =
+		max_t(s64, cfs_rq->avg.load_sum - se->avg.load_sum, 0);
+	cfs_rq->avg.util_avg =
+		max_t(long, cfs_rq->avg.util_avg - se->avg.util_avg, 0);
+	cfs_rq->avg.util_sum =
+		max_t(s32, cfs_rq->avg.util_sum - se->avg.util_sum, 0);
 #endif
 }
 
@@ -8150,8 +7931,8 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 	cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
 #endif
 #ifdef CONFIG_SMP
-	atomic64_set(&cfs_rq->decay_counter, 1);
-	atomic_long_set(&cfs_rq->removed_load, 0);
+	atomic_long_set(&cfs_rq->removed_load_avg, 0);
+	atomic_long_set(&cfs_rq->removed_util_avg, 0);
 #endif
 }
 
@@ -8196,14 +7977,14 @@ static void task_move_group_fair(struct task_struct *p, int queued)
 	if (!queued) {
 		cfs_rq = cfs_rq_of(se);
 		se->vruntime += cfs_rq->min_vruntime;
+
 #ifdef CONFIG_SMP
-		/*
-		 * migrate_task_rq_fair() will have removed our previous
-		 * contribution, but we must synchronize for ongoing future
-		 * decay.
-		 */
-		se->avg.decay_count = atomic64_read(&cfs_rq->decay_counter);
-		cfs_rq->blocked_load_avg += se->avg.load_avg_contrib;
+		/* Virtually synchronize task with its new cfs_rq */
+		p->se.avg.last_update_time = cfs_rq->avg.last_update_time;
+		cfs_rq->avg.load_avg += p->se.avg.load_avg;
+		cfs_rq->avg.load_sum += p->se.avg.load_sum;
+		cfs_rq->avg.util_avg += p->se.avg.util_avg;
+		cfs_rq->avg.util_sum += p->se.avg.util_sum;
 #endif
 	}
 }
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index d465a5c..3dfec8d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -245,7 +245,6 @@ struct task_group {
 
 #ifdef	CONFIG_SMP
 	atomic_long_t load_avg;
-	atomic_t runnable_avg;
 #endif
 #endif
 
@@ -366,27 +365,18 @@ struct cfs_rq {
 
 #ifdef CONFIG_SMP
 	/*
-	 * CFS Load tracking
-	 * Under CFS, load is tracked on a per-entity basis and aggregated up.
-	 * This allows for the description of both thread and group usage (in
-	 * the FAIR_GROUP_SCHED case).
-	 * runnable_load_avg is the sum of the load_avg_contrib of the
-	 * sched_entities on the rq.
-	 * blocked_load_avg is similar to runnable_load_avg except that its
-	 * the blocked sched_entities on the rq.
-	 * utilization_load_avg is the sum of the average running time of the
-	 * sched_entities on the rq.
+	 * CFS load tracking
 	 */
-	unsigned long runnable_load_avg, blocked_load_avg, utilization_load_avg;
-	atomic64_t decay_counter;
-	u64 last_decay;
-	atomic_long_t removed_load;
-
+	struct sched_avg avg;
 #ifdef CONFIG_FAIR_GROUP_SCHED
-	/* Required to track per-cpu representation of a task_group */
-	u32 tg_runnable_contrib;
-	unsigned long tg_load_contrib;
+	unsigned long tg_load_avg_contrib;
+#endif
+	atomic_long_t removed_load_avg, removed_util_avg;
+#ifndef CONFIG_64BIT
+	u64 load_last_update_time_copy;
+#endif
 
+#ifdef CONFIG_FAIR_GROUP_SCHED
 	/*
 	 *   h_load = weight * f(tg)
 	 *
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking
  2015-06-19  6:00   ` Boqun Feng
@ 2015-06-18 23:05     ` Yuyang Du
  2015-06-19  7:57       ` Boqun Feng
  0 siblings, 1 reply; 13+ messages in thread
From: Yuyang Du @ 2015-06-18 23:05 UTC (permalink / raw)
  To: Boqun Feng
  Cc: mingo, peterz, linux-kernel, pjt, bsegall, morten.rasmussen,
	vincent.guittot, dietmar.eggemann, len.brown, rafael.j.wysocki,
	fengguang.wu, srikar

On Fri, Jun 19, 2015 at 02:00:38PM +0800, Boqun Feng wrote:
> However, update_cfs_rq_load_avg() only updates cfs_rq->avg; the change
> won't be contributed or aggregated to cfs_rq's parent in the
> for_each_leaf_cfs_rq loop, therefore that's actually not a bottom-up
> update.
> 
> To fix this, I think we can add an update_cfs_shares(cfs_rq) after
> update_cfs_rq_load_avg(). Like:
> 
>  	for_each_leaf_cfs_rq(rq, cfs_rq) {
> -		/*
> -		 * Note: We may want to consider periodically releasing
> -		 * rq->lock about these updates so that creating many task
> -		 * groups does not result in continually extending hold time.
> -		 */
> -		__update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
> +		/* throttled entities do not contribute to load */
> +		if (throttled_hierarchy(cfs_rq))
> +			continue;
> +
> +		update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
> +		update_cfs_shares(cfs_rq);
>  	}
> 
> However, I think update_cfs_shares() isn't cheap, because it may do a
> bottom-up update once called. So how about just updating the root cfs_rq?
> Like:
> 
> -	/*
> -	 * Iterates the task_group tree in a bottom up fashion, see
> -	 * list_add_leaf_cfs_rq() for details.
> -	 */
> -	for_each_leaf_cfs_rq(rq, cfs_rq) {
> -		/*
> -		 * Note: We may want to consider periodically releasing
> -		 * rq->lock about these updates so that creating many task
> -		 * groups does not result in continually extending hold time.
> -		 */
> -		__update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
> -	}
> +	update_cfs_rq_load_avg(rq_clock_task(rq), &rq->cfs);

Hi Boqun,

Did I get you right:

This rewrite patch does not NEED to aggregate each entity's load into the
cfs_rq, but rather directly updates the cfs_rq's load (both runnable and
blocked), so there is NO NEED to iterate over all of the cfs_rqs.

So simply updating the top cfs_rq is already equivalent to the stock.

It is better if we iterate the cfs_rqs to update the actual weight
(update_cfs_shares), because the weight may have already changed, which
would in turn change the load. But update_cfs_shares is not cheap.

Right?

Thanks,
Yuyang

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking
  2015-06-19  7:57       ` Boqun Feng
@ 2015-06-19  3:11         ` Yuyang Du
  2015-06-19 12:22           ` Boqun Feng
  0 siblings, 1 reply; 13+ messages in thread
From: Yuyang Du @ 2015-06-19  3:11 UTC (permalink / raw)
  To: Boqun Feng
  Cc: mingo, peterz, linux-kernel, pjt, bsegall, morten.rasmussen,
	vincent.guittot, dietmar.eggemann, len.brown, rafael.j.wysocki,
	fengguang.wu, srikar

On Fri, Jun 19, 2015 at 03:57:24PM +0800, Boqun Feng wrote:
> > 
> > This rewrite patch does not NEED to aggregate each entity's load into the
> > cfs_rq, but rather directly updates the cfs_rq's load (both runnable and
> > blocked), so there is NO NEED to iterate over all of the cfs_rqs.
> 
> Actually, I'm not sure whether we NEED to aggregate or NOT.
> 
> > 
> > So simply updating the top cfs_rq is already equivalent to the stock.
> > 

OK. As for aggregation, the rewrite patch does not need it, because the cfs_rq's
load is calculated at once with all its runnable and blocked tasks counted,
assuming all the children's weights are up-to-date, of course. Please refer
to the changelog to get an idea.

> 
> The stock does have a bottom-up update, so simply updating the top
> cfs_rq is not equivalent to it. Simply updating the top cfs_rq is
> equivalent to the rewrite patch, because the rewrite patch lacks the
> aggregation.

It is not that the rewrite patch "lacks" aggregation; aggregation is needless.
The stock has to do a bottom-up update and aggregate, because 1) it updates
the load at an entity granularity, and 2) the blocked load is kept separate.

> > It is better if we iterate the cfs_rqs to update the actual weight
> > (update_cfs_shares), because the weight may have already changed, which
> > would in turn change the load. But update_cfs_shares is not cheap.
> > 
> > Right?
> 
> You get me right for most part ;-)
> 
> My points are:
> 
> 1. We *may not* need to aggregate each entity's load into the cfs_rq in
> update_blocked_averages(); simply updating the top cfs_rq may be just
> fine, but I'm not sure, so scheduler experts' insights are needed here.
 
Then I don't need to say anything about this.

> 2. Whether we need to aggregate or not, the update_blocked_averages() in
> the rewrite patch could be improved. If we need to aggregate, we have to
> add something like update_cfs_shares(). If we don't need to, we can just
> replace the loop with one update_cfs_rq_load_avg() on the root cfs_rq.
 
If update_cfs_shares() is done here, that is good, though probably not
necessary. However, we do need update_tg_load_avg() here, because if a
cfs_rq's load changes, the parent tg's load_avg should change too. I will
upload the next version soon.
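
Roughly like the following sketch (only what the next version's
update_blocked_averages() loop might look like, reusing
update_cfs_rq_load_avg() and update_tg_load_avg() from this series; not the
final code):

	for_each_leaf_cfs_rq(rq, cfs_rq) {
		/* throttled entities do not contribute to load */
		if (throttled_hierarchy(cfs_rq))
			continue;

		/* if the cfs_rq's load_avg decayed, propagate it to the tg */
		if (update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq))
			update_tg_load_avg(cfs_rq, 0);
	}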

In addition, an update to the stress + dbench test case:

I have a Core i7, not a Xeon Nehalem, and I have a patch applied that may not
impact the result. Here, dbench runs at very low CPU utilization, ~1%. Boqun
said this may result from cgroup control, since the dbench I/O load is low.

Anyway, I can't reproduce the results, the CPU0's util is 92+%, and other CPUs
have ~100% util.

Thanks,
Yuyang

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking
  2015-06-15 19:26 ` [PATCH v8 2/4] " Yuyang Du
@ 2015-06-19  6:00   ` Boqun Feng
  2015-06-18 23:05     ` Yuyang Du
  0 siblings, 1 reply; 13+ messages in thread
From: Boqun Feng @ 2015-06-19  6:00 UTC (permalink / raw)
  To: Yuyang Du
  Cc: mingo, peterz, linux-kernel, pjt, bsegall, morten.rasmussen,
	vincent.guittot, dietmar.eggemann, arjan.van.de.ven, len.brown,
	rafael.j.wysocki, fengguang.wu, srikar

Hi Yuyang,

On Tue, Jun 16, 2015 at 03:26:05AM +0800, Yuyang Du wrote:
> @@ -5977,36 +5786,6 @@ static void attach_tasks(struct lb_env *env)
>  }
>  
>  #ifdef CONFIG_FAIR_GROUP_SCHED
> -/*
> - * update tg->load_weight by folding this cpu's load_avg
> - */
> -static void __update_blocked_averages_cpu(struct task_group *tg, int cpu)
> -{
> -	struct sched_entity *se = tg->se[cpu];
> -	struct cfs_rq *cfs_rq = tg->cfs_rq[cpu];
> -
> -	/* throttled entities do not contribute to load */
> -	if (throttled_hierarchy(cfs_rq))
> -		return;
> -
> -	update_cfs_rq_blocked_load(cfs_rq, 1);
> -
> -	if (se) {
> -		update_entity_load_avg(se, 1);
> -		/*
> -		 * We pivot on our runnable average having decayed to zero for
> -		 * list removal.  This generally implies that all our children
> -		 * have also been removed (modulo rounding error or bandwidth
> -		 * control); however, such cases are rare and we can fix these
> -		 * at enqueue.
> -		 *
> -		 * TODO: fix up out-of-order children on enqueue.
> -		 */
> -		if (!se->avg.runnable_avg_sum && !cfs_rq->nr_running)
> -			list_del_leaf_cfs_rq(cfs_rq);
> -	}
> -}
> -
>  static void update_blocked_averages(int cpu)
>  {
>  	struct rq *rq = cpu_rq(cpu);
> @@ -6015,17 +5794,17 @@ static void update_blocked_averages(int cpu)
>  
>  	raw_spin_lock_irqsave(&rq->lock, flags);
>  	update_rq_clock(rq);
> +
>  	/*
>  	 * Iterates the task_group tree in a bottom up fashion, see
>  	 * list_add_leaf_cfs_rq() for details.
>  	 */
>  	for_each_leaf_cfs_rq(rq, cfs_rq) {
> -		/*
> -		 * Note: We may want to consider periodically releasing
> -		 * rq->lock about these updates so that creating many task
> -		 * groups does not result in continually extending hold time.
> -		 */
> -		__update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
> +		/* throttled entities do not contribute to load */
> +		if (throttled_hierarchy(cfs_rq))
> +			continue;
> +
> +		update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);

We iterate the task_group tree (actually the corresponding cfs_rq tree
on that cpu), because we want to do a bottom-up update. And we want a
bottom-up update, because we want to update weighted_cpuload().

__update_blocked_averages_cpu(tg, cpu) does three things:

Let's say:
cfs_rq = tg->cfs_rq[cpu]
se = tg->se[cpu]
pcfs_rq = cfs_rq_of(se), which is the parent of cfs_rq.

1. update cfs_rq->blocked_load_avg, and its contrib to its task group.
2. update se->avg and calculate the deltas of se->avg.*_avg_contrib.
3. update pcfs_rq->*_load_avg with the deltas in step 2.

In this way, __update_blocked_averages_cpu(tg, cpu) contributes tg's
load changes to its parent, so that update_blocked_averages() can
aggregate all of the load changes into weighted_cpuload().
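
In rough pseudo-C, the propagation looks like this (a sketch only;
recompute_entity_contrib() is a made-up name for the delta calculation
that update_entity_load_avg() does internally):

static void sketch_fold_one_level(struct task_group *tg, int cpu)
{
	struct cfs_rq *cfs_rq = tg->cfs_rq[cpu];
	struct sched_entity *se = tg->se[cpu];
	struct cfs_rq *pcfs_rq = cfs_rq_of(se);
	long delta;

	/* 1. decay cfs_rq's blocked load and fold it into the tg */
	update_cfs_rq_blocked_load(cfs_rq, 1);

	/* 2. recompute the group se's contrib and get the delta */
	delta = recompute_entity_contrib(se);	/* made-up helper */

	/* 3. push the delta up into the parent cfs_rq */
	pcfs_rq->blocked_load_avg += delta;
}

Because for_each_leaf_cfs_rq() visits children before parents, each
delta has already been folded in by the time the parent is visited, so
the aggregate reaches the root and thus weighted_cpuload().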

However, update_cfs_rq_load_avg() only updates cfs_rq->avg; the change
won't be contributed or aggregated to cfs_rq's parent in the
for_each_leaf_cfs_rq() loop, therefore that's actually not a bottom-up
update.

To fix this, I think we can add an update_cfs_shares(cfs_rq) call after
update_cfs_rq_load_avg(), like:

 	for_each_leaf_cfs_rq(rq, cfs_rq) {
-		/*
-		 * Note: We may want to consider periodically releasing
-		 * rq->lock about these updates so that creating many task
-		 * groups does not result in continually extending hold time.
-		 */
-		__update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
+		/* throttled entities do not contribute to load */
+		if (throttled_hierarchy(cfs_rq))
+			continue;
+
+		update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
+		update_cfs_shares(cfs_rq);
 	}

However, I think update_cfs_shares() isn't cheap, because it may do a
bottom-up update every time it is called. So how about just updating the
root cfs_rq? Like:

-	/*
-	 * Iterates the task_group tree in a bottom up fashion, see
-	 * list_add_leaf_cfs_rq() for details.
-	 */
-	for_each_leaf_cfs_rq(rq, cfs_rq) {
-		/*
-		 * Note: We may want to consider periodically releasing
-		 * rq->lock about these updates so that creating many task
-		 * groups does not result in continually extending hold time.
-		 */
-		__update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
-	}
+	update_cfs_rq_load_avg(rq_clock_task(rq), &rq->cfs);

Thanks and Best Regards,
Boqun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking
  2015-06-18 23:05     ` Yuyang Du
@ 2015-06-19  7:57       ` Boqun Feng
  2015-06-19  3:11         ` Yuyang Du
  0 siblings, 1 reply; 13+ messages in thread
From: Boqun Feng @ 2015-06-19  7:57 UTC (permalink / raw)
  To: Yuyang Du
  Cc: mingo, peterz, linux-kernel, pjt, bsegall, morten.rasmussen,
	vincent.guittot, dietmar.eggemann, len.brown, rafael.j.wysocki,
	fengguang.wu, srikar

Hi Yuyang,

On Fri, Jun 19, 2015 at 07:05:54AM +0800, Yuyang Du wrote:
> On Fri, Jun 19, 2015 at 02:00:38PM +0800, Boqun Feng wrote:
> > However, update_cfs_rq_load_avg() only updates cfs_rq->avg; the change
> > won't be contributed or aggregated to cfs_rq's parent in the
> > for_each_leaf_cfs_rq() loop, therefore that's actually not a bottom-up
> > update.
> > 
> > To fix this, I think we can add an update_cfs_shares(cfs_rq) call after
> > update_cfs_rq_load_avg(), like:
> > 
> >  	for_each_leaf_cfs_rq(rq, cfs_rq) {
> > -		/*
> > -		 * Note: We may want to consider periodically releasing
> > -		 * rq->lock about these updates so that creating many task
> > -		 * groups does not result in continually extending hold time.
> > -		 */
> > -		__update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
> > +		/* throttled entities do not contribute to load */
> > +		if (throttled_hierarchy(cfs_rq))
> > +			continue;
> > +
> > +		update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
> > +		update_cfs_shares(cfs_rq);
> >  	}
> > 
> > However, I think update_cfs_shares() isn't cheap, because it may do a
> > bottom-up update every time it is called. So how about just updating the
> > root cfs_rq? Like:
> > 
> > -	/*
> > -	 * Iterates the task_group tree in a bottom up fashion, see
> > -	 * list_add_leaf_cfs_rq() for details.
> > -	 */
> > -	for_each_leaf_cfs_rq(rq, cfs_rq) {
> > -		/*
> > -		 * Note: We may want to consider periodically releasing
> > -		 * rq->lock about these updates so that creating many task
> > -		 * groups does not result in continually extending hold time.
> > -		 */
> > -		__update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
> > -	}
> > +	update_cfs_rq_load_avg(rq_clock_task(rq), &rq->cfs);
> 
> Hi Boqun,
> 
> Did I get you right:
> 
> This rewrite patch does not NEED to aggregate the entities' load into
> the cfs_rq, but rather directly updates the cfs_rq's load (both runnable
> and blocked), so there is NO NEED to iterate over all of the cfs_rqs.

Actually, I'm not sure whether we NEED to aggregate or NOT.

> 
> So simply updating the top cfs_rq is already equivalent to the stock.
> 

The stock does have a bottom-up update, so simply updating the top
cfs_rq is not equivalent to it. Simply updating the top cfs_rq is
equivalent to the rewrite patch, because the rewrite patch lacks the
aggregation.

> It is better if we iterate the cfs_rqs to update the actual weight
> (update_cfs_shares()), because the weight may have already changed, which
> would in turn change the load. But update_cfs_shares() is not cheap.
> 
> Right?

You got me right for the most part ;-)

My points are:

1. We *may not* need to aggregate the entities' load into the cfs_rq in
update_blocked_averages(); simply updating the top cfs_rq may be just
fine, but I'm not sure, so scheduler experts' insights are needed here.

2. Whether we need to aggregate or not, the update_blocked_averages() in
the rewrite patch could be improved. If we need to aggregate, we have to
add something like update_cfs_shares(). If we don't, we can just
replace the loop with one update_cfs_rq_load_avg() on the root cfs_rq.

I think we'd better figure out the "may not" part in point 1 first, to
get a reasonable implementation of update_blocked_averages().

Is that clear now?

Thanks and Best Regards,
Boqun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking
  2015-06-19  3:11         ` Yuyang Du
@ 2015-06-19 12:22           ` Boqun Feng
  2015-06-21 22:43             ` Yuyang Du
  0 siblings, 1 reply; 13+ messages in thread
From: Boqun Feng @ 2015-06-19 12:22 UTC (permalink / raw)
  To: Yuyang Du
  Cc: mingo, peterz, linux-kernel, pjt, bsegall, morten.rasmussen,
	vincent.guittot, dietmar.eggemann, len.brown, rafael.j.wysocki,
	fengguang.wu, srikar

Hi Yuyang,

On Fri, Jun 19, 2015 at 11:11:16AM +0800, Yuyang Du wrote:
> On Fri, Jun 19, 2015 at 03:57:24PM +0800, Boqun Feng wrote:
> > > 
> > > This rewrite patch does not NEED to aggregate the entities' load into
> > > the cfs_rq, but rather directly updates the cfs_rq's load (both runnable
> > > and blocked), so there is NO NEED to iterate over all of the cfs_rqs.
> > 
> > Actually, I'm not sure whether we NEED to aggregate or NOT.
> > 
> > > 
> > > So simply updating the top cfs_rq is already equivalent to the stock.
> > > 
> 
> OK. As for aggregation, the rewrite patch does not need it, because the
> cfs_rq's load is calculated at once with all of its runnable and blocked
> tasks counted, assuming all the children's weights are up to date, of
> course. Please refer to the changelog to get an idea.
> 
> > 
> > The stock does have a bottom-up update, so simply updating the top
> > cfs_rq is not equivalent to it. Simply updating the top cfs_rq is
> > equivalent to the rewrite patch, because the rewrite patch lacks the
> > aggregation.
> 
> It is not that the rewrite patch "lacks" aggregation, it is that
> aggregation is needless. The stock has to do a bottom-up update and
> aggregate, because 1) it updates the load at an entity granularity, and
> 2) the blocked load is separate.

Yep, you are right, the aggregation is not necessary.

Let me see if I understand you: in the rewrite, when we call
update_cfs_rq_load_avg(), we need neither to aggregate the children's
load_avg nor to update cfs_rq->load.weight, because:

1) For the load before cfs_rq->last_update_time, it's already in the
->load_avg, and decay will do the job.
2) For the load from cfs_rq->last_update_time to now, we calculate it
with cfs_rq->load.weight, and the weight should be the weight at
->last_update_time rather than now.

Right?
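
That is, as a toy fragment of what I think happens (made-up numbers;
contrib() stands in for __compute_runnable_contrib()):

/*
 *   last_update_time                        now
 *        |<-------- delta: 3 periods -------->|
 */

/* window 1: all history is already in load_sum; one decay ages it */
u64 load_sum = decay_load(cfs_rq->avg.load_sum, 3);

/*
 * window 2: accrued at the weight that was in effect at
 * last_update_time, which is exactly what cfs_rq->load.weight still
 * holds, since nothing has updated it in between
 */
load_sum += cfs_rq->load.weight * contrib(3);	/* contrib() is made up */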

> 
> > > It is better if we iterate the cfs_rqs to update the actual weight
> > > (update_cfs_shares()), because the weight may have already changed, which
> > > would in turn change the load. But update_cfs_shares() is not cheap.
> > > 
> > > Right?
> > 
> > You got me right for the most part ;-)
> > 
> > My points are:
> > 
> > 1. We *may not* need to aggregate the entities' load into the cfs_rq in
> > update_blocked_averages(); simply updating the top cfs_rq may be just
> > fine, but I'm not sure, so scheduler experts' insights are needed here.
>  
> Then I don't need to say anything about this.
> 
> > 2. Whether we need to aggregate or not, the update_blocked_averages() in
> > the rewrite patch could be improved. If we need to aggregate, we have to
> > add something like update_cfs_shares(). If we don't, we can just
> > replace the loop with one update_cfs_rq_load_avg() on the root cfs_rq.
>  
> If update_cfs_shares() is done here, that is good, but it is probably
> not necessary. However, we do need to call update_tg_load_avg() here,
> because if a cfs_rq's

We may have another problem even if we update_tg_load_avg(), because
after the loop, each cfs_rq's ->load.weight is not up to date, right? So
next time, before we update_cfs_rq_load_avg(), we need to guarantee that
cfs_rq->load.weight has already been updated, right? And IMO, we don't
have that guarantee yet, do we?
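
To make the worry concrete, a hypothetical sequence:

/*
 * t0: update_blocked_averages() accrues [last_update_time, t0] at
 *     weight W and sets last_update_time = t0; cfs_rq->load.weight
 *     is left at W.
 * t1: the group's shares change, so the weight should now be W',
 *     but this cfs_rq stays inactive and nothing writes the new
 *     weight back.
 * t2: the next update_cfs_rq_load_avg() accrues [t0, t2] at the
 *     stale W instead of W'.
 */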

> load changes, the parent tg's load_avg should change too. I will upload
> the next version soon.
> 
> In addition, an update on the stress + dbench test case:
> 
> I have a Core i7, not a Xeon Nehalem, and I have an extra patch applied
> that may not impact the result. With that, dbench runs at very low CPU
> utilization, ~1%. Boqun said this may result from cgroup control, since
> the dbench I/O is low.
> 
> Anyway, I can't reproduce the results: CPU0's utilization is 92+%, and
> the other CPUs are at ~100%.

Thank you for looking into that problem, and I will test with your new
version of the patch ;-)

Thanks,
Boqun

> 
> Thanks,
> Yuyang

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking
  2015-06-19 12:22           ` Boqun Feng
@ 2015-06-21 22:43             ` Yuyang Du
  0 siblings, 0 replies; 13+ messages in thread
From: Yuyang Du @ 2015-06-21 22:43 UTC (permalink / raw)
  To: Boqun Feng
  Cc: mingo, peterz, linux-kernel, pjt, bsegall, morten.rasmussen,
	vincent.guittot, dietmar.eggemann, len.brown, rafael.j.wysocki,
	fengguang.wu, srikar

On Fri, Jun 19, 2015 at 08:22:07PM +0800, Boqun Feng wrote:
> > It is not that the rewrite patch "lacks" aggregation, it is that
> > aggregation is needless. The stock has to do a bottom-up update and
> > aggregate, because 1) it updates the load at an entity granularity, and
> > 2) the blocked load is separate.
> 
> Yep, you are right, the aggregation is not necessary.
> 
> Let me see if I understand you: in the rewrite, when we call
> update_cfs_rq_load_avg(), we need neither to aggregate the children's
> load_avg nor to update cfs_rq->load.weight, because:
> 
> 1) For the load before cfs_rq->last_update_time, it's already in the
> ->load_avg, and decay will do the job.
> 2) For the load from cfs_rq->last_update_time to now, we calculate it
> with cfs_rq->load.weight, and the weight should be the weight at
> ->last_update_time rather than now.
> 
> Right?
 
Yes.

> > If update_cfs_shares() is done here, that is good, but it is probably
> > not necessary. However, we do need to call update_tg_load_avg() here,
> > because if a cfs_rq's
> 
> We may have another problem even if we update_tg_load_avg(), because
> after the loop, each cfs_rq's ->load.weight is not up to date, right? So
> next time, before we update_cfs_rq_load_avg(), we need to guarantee that
> cfs_rq->load.weight has already been updated, right? And IMO, we don't
> have that guarantee yet, do we?

If we update the weight, we must update the load_avg. But if we update
the load_avg, we may need to update the weight. Yes, your comment here
is valid, but we already update the shares as needed in the cases when
they are "active"; update_blocked_averages() is largely for inactive
group entities, so we should be fine here.
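
A sketch of my point (call sites from memory, so they may be inexact):

/*
 * The active paths already pair the two updates:
 *
 *   enqueue_entity()/dequeue_entity()/entity_tick():
 *       update the entity/cfs_rq load averages;
 *       update_cfs_shares(cfs_rq);      <- weight refreshed
 *
 * update_blocked_averages() mostly sees cfs_rqs that are not being
 * enqueued/dequeued, i.e. inactive ones, where a stale weight
 * matters much less.
 */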
 
> > load changes, the parent tg's load_avg should change too. I will upload
> > the next version soon.
> > 
> > In addition, an update on the stress + dbench test case:
> > 
> > I have a Core i7, not a Xeon Nehalem, and I have an extra patch applied
> > that may not impact the result. With that, dbench runs at very low CPU
> > utilization, ~1%. Boqun said this may result from cgroup control, since
> > the dbench I/O is low.
> > 
> > Anyway, I can't reproduce the results: CPU0's utilization is 92+%, and
> > the other CPUs are at ~100%.
> 
> Thank you for looking into that problem, and I will test with your new
> version of the patch ;-)
 
That would be good. I ran dbench "as is", and its output looks pretty fine.

Thanks,
Yuyang

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-06-22  6:35 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1432518587-114210-1-git-send-email-yuyang.du@intel.com>
     [not found] ` <1432518587-114210-3-git-send-email-yuyang.du@intel.com>
2015-05-26 16:06   ` [PATCH v8 2/4] sched: Rewrite runnable load and utilization average tracking Vincent Guittot
2015-05-27 22:36     ` Yuyang Du
2015-06-02  0:25 ` [PATCH v8 0/4] " Yuyang Du
2015-06-09  1:21 ` Yuyang Du
2015-06-15 10:38   ` Boqun Feng
2015-06-15 18:46     ` Yuyang Du
2015-06-15 19:26 [Resend PATCH " Yuyang Du
2015-06-15 19:26 ` [PATCH v8 2/4] " Yuyang Du
2015-06-19  6:00   ` Boqun Feng
2015-06-18 23:05     ` Yuyang Du
2015-06-19  7:57       ` Boqun Feng
2015-06-19  3:11         ` Yuyang Du
2015-06-19 12:22           ` Boqun Feng
2015-06-21 22:43             ` Yuyang Du
