FYI, we noticed hackbench.throughput -32.9% regression due to commit:

commit 53d3bc773eaa7ab1cf63585e76af7ee869d5e709 ("Revert "sched/fair: Fix fairness issue on migration"")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

in testcase: hackbench
on test machine: ivb42: 48 threads Ivytown Ivy Bridge-EP with 64G memory
with following parameters: cpufreq_governor=performance/ipc=socket/mode=threads/nr_threads=50%

In addition to that, the commit also has significant impact on the following tests:

unixbench: unixbench.score 25.9% improvement on test machine - ivb42: 48 threads Ivytown Ivy Bridge-EP with 64G memory
with test parameters: cpufreq_governor=performance/nr_task=100%/test=context1

hackbench: hackbench.throughput -15.6% regression on test machine - lkp-hsw-ep4: 72 threads Haswell-EP with 128G memory
with test parameters: cpufreq_governor=performance/ipc=pipe/iterations=12/mode=process/nr_threads=50%

Details are as below:
-------------------------------------------------------------------------------------------------->

=========================================================================================
compiler/cpufreq_governor/ipc/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
  gcc-4.9/performance/socket/x86_64-rhel/threads/50%/debian-x86_64-2015-02-07.cgz/ivb42/hackbench

commit:
  c5114626f33b62fa7595e57d87f33d9d1f8298a2
  53d3bc773eaa7ab1cf63585e76af7ee869d5e709

c5114626f33b62fa 53d3bc773eaa7ab1cf63585e76
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
    196590 ±  0%     -32.9%     131963 ±  2%  hackbench.throughput
    602.66 ±  0%      +2.8%     619.27 ±  2%  hackbench.time.elapsed_time
    602.66 ±  0%      +2.8%     619.27 ±  2%  hackbench.time.elapsed_time.max
  1.76e+08 ±  3%    +236.0%  5.914e+08 ±  2%  hackbench.time.involuntary_context_switches
    208664 ±  2%     +26.0%     262929 ±  3%  hackbench.time.minor_page_faults
      4401 ±  0%      +5.7%       4650 ±  0%  hackbench.time.percent_of_cpu_this_job_got
     25256 ±  0%     +10.2%      27842 ±  2%  hackbench.time.system_time
      1272 ±  0%     -24.5%     961.37 ±  2%  hackbench.time.user_time
  7.64e+08 ±  1%    +131.8%  1.771e+09 ±  2%  hackbench.time.voluntary_context_switches
    143370 ±  0%     -12.0%     126124 ±  1%  meminfo.SUnreclaim
   2462880 ±  0%     -35.6%    1585869 ±  5%  softirqs.SCHED
      4051 ±  0%     -39.9%       2434 ±  3%  uptime.idle
   1766752 ±  1%    +122.6%    3932589 ±  1%  vmstat.system.cs
    249718 ±  2%    +307.4%    1017398 ±  3%  vmstat.system.in
  1.76e+08 ±  3%    +236.0%  5.914e+08 ±  2%  time.involuntary_context_switches
    208664 ±  2%     +26.0%     262929 ±  3%  time.minor_page_faults
      1272 ±  0%     -24.5%     961.37 ±  2%  time.user_time
  7.64e+08 ±  1%    +131.8%  1.771e+09 ±  2%  time.voluntary_context_switches
      2228 ± 92%    +137.1%       5285 ± 15%  numa-meminfo.node0.AnonHugePages
     73589 ±  4%     -12.5%      64393 ±  2%  numa-meminfo.node0.SUnreclaim
     27438 ± 83%    +102.6%      55585 ±  6%  numa-meminfo.node0.Shmem
    101051 ±  3%     -10.9%      90044 ±  2%  numa-meminfo.node0.Slab
     69844 ±  4%     -11.8%      61579 ±  3%  numa-meminfo.node1.SUnreclaim
   1136461 ±  3%     +16.6%    1324662 ±  5%  numa-numastat.node0.local_node
   1140216 ±  3%     +16.2%    1324689 ±  5%  numa-numastat.node0.numa_hit
      3755 ± 68%     -99.3%      27.25 ± 94%  numa-numastat.node0.other_node
   1098889 ±  4%     +20.1%    1320211 ±  6%  numa-numastat.node1.local_node
   1101996 ±  4%     +20.5%    1327590 ±  6%  numa-numastat.node1.numa_hit
      7.18 ±  0%     -50.2%       3.57 ± 43%  perf-profile.cycles-pp.call_cpuidle
      8.09 ±  0%     -44.7%       4.47 ± 38%  perf-profile.cycles-pp.cpu_startup_entry
      7.17 ±  0%     -50.3%       3.56 ± 43%  perf-profile.cycles-pp.cpuidle_enter
      7.14 ±  0%     -50.3%       3.55 ± 43%  perf-profile.cycles-pp.cpuidle_enter_state
      7.11 ±  0%     -50.6%       3.52 ± 43%  perf-profile.cycles-pp.intel_idle
      8.00 ±  0%     -44.5%       4.44 ± 38%  perf-profile.cycles-pp.start_secondary
     92.32 ±  0%      +5.4%      97.32 ±  0%  turbostat.%Busy
      2763 ±  0%      +5.4%       2912 ±  0%  turbostat.Avg_MHz
      7.48 ±  0%     -66.5%       2.50 ±  7%  turbostat.CPU%c1
      0.20 ±  2%      -6.4%       0.18 ±  2%  turbostat.CPU%c6
    180.03 ±  0%      -1.3%     177.62 ±  0%  turbostat.CorWatt
      5.83 ±  0%     +38.9%       8.10 ±  3%  turbostat.RAMWatt
      6857 ± 83%    +102.8%      13905 ±  6%  numa-vmstat.node0.nr_shmem
     18395 ±  4%     -12.4%      16121 ±  2%  numa-vmstat.node0.nr_slab_unreclaimable
    675569 ±  3%     +12.7%     761135 ±  4%  numa-vmstat.node0.numa_local
     71537 ±  5%      -7.9%      65920 ±  2%  numa-vmstat.node0.numa_other
     17456 ±  4%     -11.7%      15405 ±  3%  numa-vmstat.node1.nr_slab_unreclaimable
    695848 ±  3%     +14.9%     799683 ±  5%  numa-vmstat.node1.numa_hit
    677405 ±  4%     +14.5%     775903 ±  6%  numa-vmstat.node1.numa_local
     18442 ± 19%     +28.9%      23779 ±  5%  numa-vmstat.node1.numa_other
 1.658e+09 ±  0%     -59.1%  6.784e+08 ±  7%  cpuidle.C1-IVT.time
 1.066e+08 ±  0%     -40.3%   63661563 ±  6%  cpuidle.C1-IVT.usage
  26348635 ±  0%     -86.8%    3471048 ± 15%  cpuidle.C1E-IVT.time
    291620 ±  0%     -85.1%      43352 ± 15%  cpuidle.C1E-IVT.usage
  54158643 ±  1%     -88.5%    6254009 ± 14%  cpuidle.C3-IVT.time
    482437 ±  1%     -87.0%      62620 ± 16%  cpuidle.C3-IVT.usage
 5.028e+08 ±  0%     -75.8%  1.219e+08 ±  8%  cpuidle.C6-IVT.time
   3805026 ±  0%     -85.5%     552326 ± 16%  cpuidle.C6-IVT.usage
      2766 ±  4%     -51.4%       1344 ±  6%  cpuidle.POLL.usage
     35841 ±  0%     -12.0%      31543 ±  0%  proc-vmstat.nr_slab_unreclaimable
    154090 ±  2%     +43.1%     220509 ±  3%  proc-vmstat.numa_hint_faults
    129240 ±  2%     +47.4%     190543 ±  3%  proc-vmstat.numa_hint_faults_local
   2238386 ±  1%     +18.4%    2649737 ±  2%  proc-vmstat.numa_hit
   2232163 ±  1%     +18.4%    2643105 ±  2%  proc-vmstat.numa_local
     22315 ±  1%     -21.0%      17625 ±  5%  proc-vmstat.numa_pages_migrated
    154533 ±  2%     +45.6%     225071 ±  3%  proc-vmstat.numa_pte_updates
    382980 ±  2%     +33.2%     510157 ±  4%  proc-vmstat.pgalloc_dma32
   7311738 ±  2%     +37.2%   10029060 ±  2%  proc-vmstat.pgalloc_normal
   7672040 ±  2%     +37.1%   10519738 ±  2%  proc-vmstat.pgfree
     22315 ±  1%     -21.0%      17625 ±  5%  proc-vmstat.pgmigrate_success
      5487 ±  6%     -12.6%       4797 ±  4%  slabinfo.UNIX.active_objs
      5609 ±  5%     -12.2%       4926 ±  4%  slabinfo.UNIX.num_objs
      4362 ±  4%     +14.6%       4998 ±  2%  slabinfo.cred_jar.active_objs
      4362 ±  4%     +14.6%       4998 ±  2%  slabinfo.cred_jar.num_objs
     42525 ±  0%     -41.6%      24824 ±  3%  slabinfo.kmalloc-256.active_objs
    845.50 ±  0%     -42.9%     482.50 ±  3%  slabinfo.kmalloc-256.active_slabs
     54124 ±  0%     -42.9%      30920 ±  3%  slabinfo.kmalloc-256.num_objs
    845.50 ±  0%     -42.9%     482.50 ±  3%  slabinfo.kmalloc-256.num_slabs
     47204 ±  0%     -37.9%      29335 ±  2%  slabinfo.kmalloc-512.active_objs
    915.25 ±  0%     -39.8%     551.00 ±  3%  slabinfo.kmalloc-512.active_slabs
     58599 ±  0%     -39.8%      35300 ±  3%  slabinfo.kmalloc-512.num_objs
    915.25 ±  0%     -39.8%     551.00 ±  3%  slabinfo.kmalloc-512.num_slabs
     12443 ±  2%     -20.1%       9944 ±  3%  slabinfo.pid.active_objs
     12443 ±  2%     -20.1%       9944 ±  3%  slabinfo.pid.num_objs
    440.00 ±  5%     -32.8%     295.75 ±  4%  slabinfo.taskstats.active_objs
    440.00 ±  5%     -32.8%     295.75 ±  4%  slabinfo.taskstats.num_objs
    312.45 ±157%     -94.8%      16.29 ± 33%  sched_debug.cfs_rq:/.load.stddev
      0.27 ±  5%     -56.3%       0.12 ± 30%  sched_debug.cfs_rq:/.nr_running.stddev
     16.51 ±  1%      +9.5%      18.08 ±  3%  sched_debug.cfs_rq:/.runnable_load_avg.avg
      0.05 ±100%   +7950.0%       3.66 ± 48%  sched_debug.cfs_rq:/.runnable_load_avg.min
   -740916 ±-28%    -158.5%     433310 ±120%  sched_debug.cfs_rq:/.spread0.avg
   1009940 ± 19%     +75.8%    1775442 ± 30%  sched_debug.cfs_rq:/.spread0.max
  -2384171 ± -7%     -65.7%    -818684 ±-76%  sched_debug.cfs_rq:/.spread0.min
    749.14 ±  1%     +13.0%     846.34 ±  1%  sched_debug.cfs_rq:/.util_avg.min
     51.66 ±  4%     -36.3%      32.92 ±  5%  sched_debug.cfs_rq:/.util_avg.stddev
    161202 ±  7%     -41.7%      93997 ±  4%  sched_debug.cpu.avg_idle.avg
    595158 ±  6%     -51.2%     290491 ± 22%  sched_debug.cpu.avg_idle.max
    132760 ±  8%     -58.8%      54718 ± 19%  sched_debug.cpu.avg_idle.stddev
     11.40 ± 11%    +111.0%      24.05 ± 16%  sched_debug.cpu.clock.stddev
     11.40 ± 11%    +111.0%      24.05 ± 16%  sched_debug.cpu.clock_task.stddev
     32.34 ±  2%     +23.9%      40.07 ± 19%  sched_debug.cpu.cpu_load[0].max
      0.34 ±103%    +520.0%       2.11 ± 67%  sched_debug.cpu.cpu_load[0].min
     32.18 ±  2%     +22.7%      39.50 ± 17%  sched_debug.cpu.cpu_load[1].max
      3.32 ±  8%     +84.9%       6.14 ± 12%  sched_debug.cpu.cpu_load[1].min
      5.39 ±  7%     +36.3%       7.34 ±  4%  sched_debug.cpu.cpu_load[2].min
     33.18 ±  3%     +14.0%      37.82 ±  5%  sched_debug.cpu.cpu_load[4].max
      5.56 ±  6%     +16.2%       6.45 ±  6%  sched_debug.cpu.cpu_load[4].stddev
     16741 ±  0%     -15.4%      14166 ±  2%  sched_debug.cpu.curr->pid.avg
     19196 ±  0%     -18.3%      15690 ±  1%  sched_debug.cpu.curr->pid.max
      5174 ±  5%     -55.4%       2305 ± 14%  sched_debug.cpu.curr->pid.stddev
      1410 ±  1%     -14.2%       1210 ±  6%  sched_debug.cpu.nr_load_updates.stddev
      9.95 ±  3%     -14.5%       8.51 ±  5%  sched_debug.cpu.nr_running.avg
     29.07 ±  2%     -15.0%      24.70 ±  4%  sched_debug.cpu.nr_running.max
      0.05 ±100%    +850.0%       0.43 ± 37%  sched_debug.cpu.nr_running.min
      7.64 ±  3%     -23.0%       5.88 ±  2%  sched_debug.cpu.nr_running.stddev
  10979930 ±  1%    +123.3%   24518490 ±  2%  sched_debug.cpu.nr_switches.avg
  12350130 ±  1%    +117.5%   26856375 ±  2%  sched_debug.cpu.nr_switches.max
   9594835 ±  2%    +132.6%   22314436 ±  2%  sched_debug.cpu.nr_switches.min
    769296 ±  1%     +56.8%    1206190 ±  3%  sched_debug.cpu.nr_switches.stddev
      8.30 ± 18%     +32.9%      11.02 ± 15%  sched_debug.cpu.nr_uninterruptible.max

[ASCII plots not reproduced here. Panels: turbostat.Avg_MHz, turbostat._Busy,
turbostat.CPU_c1, turbostat.PkgWatt, turbostat.CorWatt, turbostat.RAMWatt,
hackbench.throughput, time.user_time, time.minor_page_faults,
time.voluntary_context_switches, time.involuntary_context_switches,
hackbench.time.user_time, hackbench.time.percent_of_cpu_this_job_got,
hackbench.time.minor_page_faults, hackbench.time.voluntary_context_switches,
hackbench.time.involuntary_context_switches, softirqs.SCHED, uptime.idle,
cpuidle.POLL.usage, cpuidle.C1-IVT.time, cpuidle.C1-IVT.usage,
cpuidle.C1E-IVT.time, cpuidle.C1E-IVT.usage, cpuidle.C3-IVT.time,
cpuidle.C3-IVT.usage, cpuidle.C6-IVT.time, cpuidle.C6-IVT.usage,
meminfo.Slab, meminfo.SUnreclaim, vmstat.system.in, vmstat.system.cs]

	[*] bisect-good sample
	[O] bisect-bad  sample

To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

***************************************************************************************************
ivb42: 48 threads Ivytown Ivy Bridge-EP with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/tbox_group/test/testcase:
  gcc-4.9/performance/x86_64-rhel/100%/debian-x86_64-2015-02-07.cgz/ivb42/context1/unixbench

commit:
  c5114626f33b62fa7595e57d87f33d9d1f8298a2
  53d3bc773eaa7ab1cf63585e76af7ee869d5e709

c5114626f33b62fa 53d3bc773eaa7ab1cf63585e76
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
     18006 ±  1%     +25.9%      22672 ±  0%  unixbench.score
     39774 ± 33%  +5.4e+05%  2.138e+08 ±  4%  unixbench.time.involuntary_context_switches
      1717 ±  0%      +1.9%       1749 ±  0%  unixbench.time.percent_of_cpu_this_job_got
    152.51 ±  0%     +33.9%     204.18 ±  1%  unixbench.time.user_time
 7.052e+08 ±  1%      -3.9%   6.78e+08 ±  1%  unixbench.time.voluntary_context_switches
 4.243e+08 ±  3%      -9.4%  3.845e+08 ±  7%  cpuidle.C1-IVT.time
 1.544e+08 ±  6%     -37.5%   96475672 ±  5%  cpuidle.C1-IVT.usage
    409626 ±  4%     +28.6%     526843 ± 15%  softirqs.RCU
    274815 ±  4%     -27.5%     199184 ±  9%  softirqs.SCHED
     39774 ± 33%  +5.4e+05%  2.138e+08 ±  4%  time.involuntary_context_switches
    152.51 ±  0%     +33.9%     204.18 ±  1%  time.user_time
     45.25 ±  0%     +12.7%      51.00 ±  0%  vmstat.procs.r
  11774346 ±  0%     +20.2%   14152328 ±  0%  vmstat.system.cs
   1848728 ±  0%     +22.7%    2269123 ±  0%  sched_debug.cfs_rq:/.min_vruntime.avg
   2029277 ±  0%     +18.7%    2409509 ±  0%  sched_debug.cfs_rq:/.min_vruntime.max
   1561074 ±  5%     +29.9%    2027122 ±  3%  sched_debug.cfs_rq:/.min_vruntime.min
    103209 ±  9%     -17.8%      84792 ± 10%  sched_debug.cfs_rq:/.min_vruntime.stddev
     11.68 ±  6%     -35.9%       7.49 ±  6%  sched_debug.cfs_rq:/.runnable_load_avg.avg
    103208 ±  9%     -17.8%      84795 ± 10%  sched_debug.cfs_rq:/.spread0.stddev
    946393 ±  5%     -24.5%     714499 ± 10%  sched_debug.cpu.avg_idle.max
    234059 ±  6%     -36.5%     148728 ± 37%  sched_debug.cpu.avg_idle.stddev
     11.57 ±  6%     -31.2%       7.96 ± 20%  sched_debug.cpu.cpu_load[1].avg
     11.61 ±  7%     -34.4%       7.61 ± 12%  sched_debug.cpu.cpu_load[2].avg
     11.70 ±  7%     -35.4%       7.56 ±  8%  sched_debug.cpu.cpu_load[3].avg
     11.86 ±  7%     -36.1%       7.58 ±  6%  sched_debug.cpu.cpu_load[4].avg
      0.48 ±  6%     +13.9%       0.54 ±  3%  sched_debug.cpu.nr_running.avg
      0.37 ±  5%     +10.5%       0.41 ±  4%  sched_debug.cpu.nr_running.stddev
  14556348 ±  0%     +20.1%   17474921 ±  0%  sched_debug.cpu.nr_switches.avg
  14764042 ±  0%     +24.5%   18380752 ±  0%  sched_debug.cpu.nr_switches.max
  14296508 ±  0%     +14.9%   16430231 ±  0%  sched_debug.cpu.nr_switches.min
    121577 ± 25%    +268.4%     447878 ±  8%  sched_debug.cpu.nr_switches.stddev
     -9.42 ± -3%     +20.4%     -11.33 ±-12%  sched_debug.cpu.nr_uninterruptible.min

***************************************************************************************************
lkp-hsw-ep4: 72 threads Haswell-EP with 128G memory
=========================================================================================
compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
  gcc-4.9/performance/pipe/12/x86_64-rhel/process/50%/debian-x86_64-2015-02-07.cgz/lkp-hsw-ep4/hackbench

commit:
  c5114626f33b62fa7595e57d87f33d9d1f8298a2
  53d3bc773eaa7ab1cf63585e76af7ee869d5e709

c5114626f33b62fa 53d3bc773eaa7ab1cf63585e76
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
    207412 ±  0%     -15.6%     175076 ±  1%  hackbench.throughput
    489.41 ±  0%     +18.4%     579.66 ±  1%  hackbench.time.elapsed_time
    489.41 ±  0%     +18.4%     579.66 ±  1%  hackbench.time.elapsed_time.max
 1.005e+09 ±  0%    +113.2%  2.142e+09 ±  4%  hackbench.time.involuntary_context_switches
      6966 ±  0%      +2.2%       7118 ±  0%  hackbench.time.percent_of_cpu_this_job_got
     32394 ±  0%     +19.3%      38635 ±  1%  hackbench.time.system_time
      1700 ±  0%     +54.6%       2627 ±  3%  hackbench.time.user_time
 3.164e+09 ±  0%     +64.2%  5.195e+09 ±  3%  hackbench.time.voluntary_context_switches
    536.44 ±  0%     +17.1%     627.97 ±  1%  uptime.boot
      4496 ±  1%     -16.4%       3757 ±  4%  uptime.idle
    720.75 ±  0%     +14.7%     826.75 ±  0%  vmstat.procs.r
   8795090 ±  0%     +44.3%   12689850 ±  2%  vmstat.system.cs
   2115904 ±  1%      -7.1%    1965559 ±  3%  vmstat.system.in
  49651750 ±  0%     -34.1%   32710138 ±  3%  numa-numastat.node0.local_node
  49657590 ±  0%     -34.1%   32719401 ±  3%  numa-numastat.node0.numa_hit
  51230886 ±  1%     -37.1%   32238968 ±  4%  numa-numastat.node1.local_node
  51235497 ±  1%     -37.1%   32241201 ±  4%  numa-numastat.node1.numa_hit
     16114 ±  3%     +15.3%      18577 ±  2%  softirqs.NET_RX
   3907664 ±  1%     +44.4%    5643157 ±  1%  softirqs.RCU
   2029740 ±  1%     -67.7%     655775 ± 16%  softirqs.SCHED
  17332687 ±  0%     +21.1%   20995794 ±  1%  softirqs.TIMER
     97.19 ±  0%      +1.5%      98.70 ±  0%  turbostat.%Busy
      2694 ±  0%      +1.2%       2726 ±  0%  turbostat.Avg_MHz
      2.58 ±  2%     -56.9%       1.11 ±  7%  turbostat.CPU%c1
      0.22 ±  3%     -14.8%       0.19 ±  2%  turbostat.CPU%c6
    894518 ±  5%     -16.2%     749856 ±  5%  numa-meminfo.node0.MemUsed
     31304 ± 18%     -19.8%      25116 ± 13%  numa-meminfo.node0.PageTables
    137230 ± 14%     -13.2%     119062 ±  7%  numa-meminfo.node0.Slab
     77654 ± 43%     +53.9%     119507 ±  2%  numa-meminfo.node1.Active(anon)
    676863 ±  6%     +18.9%     804493 ±  5%  numa-meminfo.node1.MemUsed
     40040 ± 87%    +102.8%      81204 ±  3%  numa-meminfo.node1.Shmem
      2.29 ±  8%     -82.5%       0.40 ±112%  perf-profile.cycles-pp.call_cpuidle
      3.41 ±  8%     -84.8%       0.52 ±113%  perf-profile.cycles-pp.cpu_startup_entry
      2.29 ±  8%     -82.4%       0.40 ±112%  perf-profile.cycles-pp.cpuidle_enter
      2.26 ±  9%     -82.4%       0.40 ±112%  perf-profile.cycles-pp.cpuidle_enter_state
      2.24 ±  9%     -82.4%       0.40 ±112%  perf-profile.cycles-pp.intel_idle
      3.42 ±  7%     -84.9%       0.52 ±113%  perf-profile.cycles-pp.start_secondary
     86451 ±  1%      +9.1%      94357 ±  3%  proc-vmstat.numa_hint_faults_local
 1.009e+08 ±  0%     -35.6%   64951081 ±  3%  proc-vmstat.numa_hit
 1.009e+08 ±  0%     -35.6%   64941826 ±  3%  proc-vmstat.numa_local
   1744958 ±  0%     -36.7%    1105128 ±  3%  proc-vmstat.pgalloc_dma32
  99309681 ±  0%     -35.5%   64014721 ±  3%  proc-vmstat.pgalloc_normal
  1.01e+08 ±  0%     -35.6%   65068018 ±  3%  proc-vmstat.pgfree
    489.41 ±  0%     +18.4%     579.66 ±  1%  time.elapsed_time
    489.41 ±  0%     +18.4%     579.66 ±  1%  time.elapsed_time.max
 1.005e+09 ±  0%    +113.2%  2.142e+09 ±  4%  time.involuntary_context_switches
     32394 ±  0%     +19.3%      38635 ±  1%  time.system_time
      1700 ±  0%     +54.6%       2627 ±  3%  time.user_time
 3.164e+09 ±  0%     +64.2%  5.195e+09 ±  3%  time.voluntary_context_switches
      7826 ± 18%     -19.7%       6283 ± 13%  numa-vmstat.node0.nr_page_table_pages
  24938156 ±  0%     -34.5%   16344223 ±  2%  numa-vmstat.node0.numa_hit
  24865727 ±  0%     -34.6%   16268676 ±  2%  numa-vmstat.node0.numa_local
     19415 ± 43%     +53.9%      29872 ±  2%  numa-vmstat.node1.nr_active_anon
     10012 ± 87%    +102.5%      20273 ±  3%  numa-vmstat.node1.nr_shmem
  25578109 ±  2%     -35.3%   16544997 ±  3%  numa-vmstat.node1.numa_hit
  25542618 ±  2%     -35.4%   16513089 ±  3%  numa-vmstat.node1.numa_local
  7.39e+08 ±  1%     -63.6%  2.693e+08 ± 12%  cpuidle.C1-HSW.time
 1.279e+08 ±  2%     -75.4%   31468140 ± 20%  cpuidle.C1-HSW.usage
  97966635 ±  3%     -38.4%   60323848 ±  6%  cpuidle.C1E-HSW.time
   2424496 ±  2%     -54.3%    1108542 ± 10%  cpuidle.C1E-HSW.usage
   2168324 ±  5%     -38.4%    1335858 ±  6%  cpuidle.C3-HSW.time
     23824 ±  2%     -51.7%      11496 ± 10%  cpuidle.C3-HSW.usage
    133416 ±  1%     -41.7%      77729 ± 10%  cpuidle.C6-HSW.usage
     72278 ± 96%     -85.4%      10574 ± 13%  cpuidle.POLL.time
      7564 ±  0%     -64.3%       2699 ± 13%  cpuidle.POLL.usage
    447972 ± 12%     -77.1%     102749 ± 39%  sched_debug.cfs_rq:/.MIN_vruntime.avg
  23408331 ±  2%     -74.0%    6077779 ± 38%  sched_debug.cfs_rq:/.MIN_vruntime.max
   3133258 ±  5%     -75.3%     773710 ± 35%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
      0.17 ±173%   +1025.0%       1.88 ± 15%  sched_debug.cfs_rq:/.load.min
      4.72 ±  5%     +21.2%       5.72 ±  4%  sched_debug.cfs_rq:/.load_avg.min
    447972 ± 12%     -77.1%     102749 ± 39%  sched_debug.cfs_rq:/.max_vruntime.avg
  23408331 ±  2%     -74.0%    6077779 ± 38%  sched_debug.cfs_rq:/.max_vruntime.max
   3133258 ±  5%     -75.3%     773710 ± 35%  sched_debug.cfs_rq:/.max_vruntime.stddev
  34877232 ±  0%     -16.9%   28973299 ±  2%  sched_debug.cfs_rq:/.min_vruntime.avg
  36136568 ±  0%     -16.9%   30030834 ±  1%  sched_debug.cfs_rq:/.min_vruntime.max
  33553337 ±  0%     -16.4%   28050567 ±  2%  sched_debug.cfs_rq:/.min_vruntime.min
    580186 ±  2%     -26.0%     429600 ± 11%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.08 ±110%    +710.0%       0.67 ± 21%  sched_debug.cfs_rq:/.nr_running.min
      0.17 ± 12%     -59.6%       0.07 ± 31%  sched_debug.cfs_rq:/.nr_running.stddev
     25.39 ±  2%     -17.8%      20.88 ±  3%  sched_debug.cfs_rq:/.runnable_load_avg.max
      0.44 ±173%   +1002.5%       4.90 ± 13%  sched_debug.cfs_rq:/.runnable_load_avg.min
      4.84 ±  2%     -39.4%       2.93 ±  8%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
    952653 ± 15%     -51.5%     462372 ± 50%  sched_debug.cfs_rq:/.spread0.avg
   2206041 ± 10%     -31.4%    1514231 ±  8%  sched_debug.cfs_rq:/.spread0.max
    577122 ±  2%     -25.8%     428166 ± 11%  sched_debug.cfs_rq:/.spread0.stddev
     46.85 ±  3%     -34.0%      30.93 ± 24%  sched_debug.cfs_rq:/.util_avg.stddev
    115635 ±  1%    +107.7%     240214 ±  8%  sched_debug.cpu.avg_idle.avg
    506560 ± 15%     +83.7%     930497 ±  4%  sched_debug.cpu.avg_idle.max
      6833 ±131%    +168.7%      18362 ± 34%  sched_debug.cpu.avg_idle.min
     78999 ±  9%    +214.9%     248764 ±  8%  sched_debug.cpu.avg_idle.stddev
    290289 ±  0%     +10.7%     321362 ±  0%  sched_debug.cpu.clock.avg
    290345 ±  0%     +10.7%     321461 ±  0%  sched_debug.cpu.clock.max
    290230 ±  0%     +10.7%     321263 ±  0%  sched_debug.cpu.clock.min
     34.48 ± 26%     +74.7%      60.23 ±  5%  sched_debug.cpu.clock.stddev
    290289 ±  0%     +10.7%     321362 ±  0%  sched_debug.cpu.clock_task.avg
    290345 ±  0%     +10.7%     321461 ±  0%  sched_debug.cpu.clock_task.max
    290230 ±  0%     +10.7%     321263 ±  0%  sched_debug.cpu.clock_task.min
     34.48 ± 26%     +74.7%      60.23 ±  5%  sched_debug.cpu.clock_task.stddev
      0.50 ± 80%    +865.0%       4.82 ±  7%  sched_debug.cpu.cpu_load[0].min
      2.00 ± 33%    +155.0%       5.10 ±  6%  sched_debug.cpu.cpu_load[1].min
      3.31 ± 17%     +59.6%       5.28 ±  6%  sched_debug.cpu.cpu_load[2].min
      4.28 ±  5%     +28.0%       5.47 ±  4%  sched_debug.cpu.cpu_load[3].min
     29.69 ± 10%     -21.4%      23.35 ±  4%  sched_debug.cpu.cpu_load[4].max
      4.39 ±  5%     +24.7%       5.47 ±  4%  sched_debug.cpu.cpu_load[4].min
      4.99 ±  9%     -30.3%       3.47 ±  5%  sched_debug.cpu.cpu_load[4].stddev
      1275 ± 74%    +660.4%       9696 ± 35%  sched_debug.cpu.curr->pid.min
      2960 ± 11%     -54.8%       1338 ± 39%  sched_debug.cpu.curr->pid.stddev
      0.22 ± 70%    +935.0%       2.30 ± 30%  sched_debug.cpu.load.min
      0.00 ± 11%     +39.0%       0.00 ±  4%  sched_debug.cpu.next_balance.stddev
    245043 ±  0%     +12.4%     275488 ±  0%  sched_debug.cpu.nr_load_updates.avg
    253700 ±  0%     +11.3%     282470 ±  0%  sched_debug.cpu.nr_load_updates.max
    242515 ±  0%     +12.5%     272755 ±  0%  sched_debug.cpu.nr_load_updates.min
      8.93 ±  5%     +12.5%      10.05 ±  2%  sched_debug.cpu.nr_running.avg
     29.08 ±  4%     -23.2%      22.35 ±  2%  sched_debug.cpu.nr_running.max
      0.11 ± 70%   +1970.0%       2.30 ± 26%  sched_debug.cpu.nr_running.min
      6.52 ±  3%     -40.5%       3.88 ±  8%  sched_debug.cpu.nr_running.stddev
  29380032 ±  0%     +62.7%   47789650 ±  1%  sched_debug.cpu.nr_switches.avg
  32480191 ±  0%     +63.0%   52947357 ±  1%  sched_debug.cpu.nr_switches.max
  26568245 ±  0%     +64.3%   43639487 ±  2%  sched_debug.cpu.nr_switches.min
   1724177 ±  1%     +28.9%    2223172 ±  5%  sched_debug.cpu.nr_switches.stddev
    307.39 ±  7%     -42.6%     176.42 ± 14%  sched_debug.cpu.nr_uninterruptible.max
   -278.64 ±-10%     -41.9%    -162.00 ± -5%  sched_debug.cpu.nr_uninterruptible.min
    131.21 ±  6%     -45.4%      71.66 ±  3%  sched_debug.cpu.nr_uninterruptible.stddev
    290228 ±  0%     +10.7%     321261 ±  0%  sched_debug.cpu_clk
    286726 ±  0%     +11.2%     318853 ±  0%  sched_debug.ktime
    290228 ±  0%     +10.7%     321261 ±  0%  sched_debug.sched_clk

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Ying Huang