public inbox for linux-kernel@vger.kernel.org
* performance changes on 4400755e:  200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and few more
@ 2015-03-18  5:00 Yuanahn Liu
  2015-03-25  3:03 ` NeilBrown
  0 siblings, 1 reply; 3+ messages in thread
From: Yuanahn Liu @ 2015-03-18  5:00 UTC (permalink / raw)
  To: NeilBrown; +Cc: lkp, lkp, LKML, Chris Mason

Hi,

FYI, we noticed performance changes on `fsmark.files_per_sec' by 4400755e356f9a2b0b7ceaa02f57b1c7546c3765:

    > commit 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
    > Author:     NeilBrown <neilb@suse.de>
    > AuthorDate: Thu Feb 26 12:47:56 2015 +1100
    > Commit:     NeilBrown <neilb@suse.de>
    > CommitDate: Wed Mar 4 13:40:19 2015 +1100
    > 
    >     md/raid5: allow the stripe_cache to grow and shrink.
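
For context, the stripe cache this commit makes elastic is visible at runtime through the md sysfs knobs (`/sys/block/mdX/md/stripe_cache_size` and `stripe_cache_active`). A rough sketch of the memory the cache pins, assuming each stripe_head holds one 4 KiB page per member disk, using the slabinfo.raid5-md0 object counts reported further down for the 3-disk ext4 testcase:

```shell
# Rough estimate only: assumes one 4 KiB page per member disk per stripe_head.
# Object counts are slabinfo.raid5-md0.num_objs before vs after the patch.
disks=3
page_kb=4
for objs in 275 3733; do
    echo "${objs} stripes -> $(( objs * disks * page_kb )) KiB"
done
```

So the grown cache pins on the order of tens of megabytes rather than a few, which is consistent with the patch trading a little memory for far fewer stripe-cache stalls.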

26089f4902595a2f64c512066af07af6e82eb096     4400755e356f9a2b0b7ceaa02f57b1c7546c3765
----------------------------------------     ----------------------------------------
run time(m)     metric_value     ±stddev     run time(m)     metric_value     ±stddev     change   testbox/benchmark/sub-testcase
--- ------  ----------------------------     --- ------  ----------------------------     -------- ------------------------------
3   18.6               6.400     ±0.0%       5   9.2               19.200     ±0.0%         200.0% ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
3   24.7               6.400     ±0.0%       3   13.7              12.800     ±0.0%         100.0% ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
3   17.5              28.267     ±9.6%       3   12.3              42.833     ±6.5%          51.5% ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-120G-NoSync
3   16.7              30.700     ±1.5%       3   12.6              40.733     ±2.4%          32.7% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
3   29.0               5.867     ±0.8%       5   23.6               7.240     ±0.7%          23.4% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
3   28.5               6.000     ±0.0%       3   23.2               7.367     ±0.6%          22.8% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
5   11.7              14.600     ±0.0%       5   9.7               17.500     ±0.4%          19.9% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
3   22.4              25.600     ±0.0%       5   17.9              30.120     ±4.1%          17.7% ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-120G-NoSync
5   10.8              47.320     ±0.6%       5   9.3               54.820     ±0.2%          15.8% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
1   0.5              252.400     ±0.0%       1   0.5              263.300     ±0.0%           4.3% ivb44/fsmark/1x-1t-4BRD_12G-RAID5-ext4-4M-30G-NoSync

3   0.5              273.100     ±4.3%       3   0.6              223.567     ±6.5%         -18.1% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
3   8.1               63.133     ±0.5%       3   9.2               55.633     ±0.2%         -11.9% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync
3   8.2               64.000     ±0.0%       3   9.2               57.600     ±0.0%         -10.0% ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync


NOTE: here is some more info about the test parameters, to help you
      understand the testcases better:

      1x: 'x' means iterations (loops), corresponding to the '-L' option of fsmark
      1t, 64t: 't' means the number of threads
      4M: the size of a single file, corresponding to the '-s' option of fsmark
      40G, 30G, 120G: the total test size

      4BRD_12G: BRD is the ramdisk driver; '4' means 4 ramdisks and '12G' is
                the size of each ramdisk, so 48G in total. The RAID array is
                built on those ramdisks.
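
To make the mapping concrete, the file count for a given sub-testcase follows from total size divided by file size (a sketch of the arithmetic only; the actual job wrapper and its exact fsmark flags may differ):

```shell
# e.g. the 4M-40G testcases: 40 GiB of total data at 4 MiB per file.
total_mb=$(( 40 * 1024 ))
file_mb=4
echo "$(( total_mb / file_mb )) files of ${file_mb}M"
```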


As you can see from the above data, interestingly, all of the
performance regressions come from the btrfs tests. That's why Chris
is also on the Cc list, just FYI.


FYI, below I have listed more detailed changes for the largest positive and negative changes.

more detailed changes about ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
---------

26089f4902595a2f  4400755e356f9a2b0b7ceaa02f  
----------------  --------------------------  
         %stddev     %change         %stddev
             \          |                \  
      6.40 ±  0%    +200.0%      19.20 ±  0%  fsmark.files_per_sec
 1.015e+08 ±  1%     -73.6%   26767355 ±  3%  fsmark.time.voluntary_context_switches
     13793 ±  1%     -73.9%       3603 ±  5%  fsmark.time.system_time
     78473 ±  6%     -64.3%      28016 ±  7%  fsmark.time.involuntary_context_switches
  15789555 ±  9%     -54.7%    7159485 ± 13%  fsmark.app_overhead
      1115 ±  0%     -50.3%        554 ±  1%  fsmark.time.elapsed_time.max
      1115 ±  0%     -50.3%        554 ±  1%  fsmark.time.elapsed_time
      1235 ±  2%     -47.5%        649 ±  3%  fsmark.time.percent_of_cpu_this_job_got
    456465 ±  1%     -26.7%     334594 ±  4%  fsmark.time.minor_page_faults
       275 ±  0%   +1257.7%       3733 ±  2%  slabinfo.raid5-md0.num_objs
       275 ±  0%   +1257.7%       3733 ±  2%  slabinfo.raid5-md0.active_objs
        11 ±  0%   +1250.9%        148 ±  2%  slabinfo.raid5-md0.active_slabs
        11 ±  0%   +1250.9%        148 ±  2%  slabinfo.raid5-md0.num_slabs
      2407 ±  4%    +293.4%       9471 ± 26%  numa-meminfo.node0.Writeback
       600 ±  4%    +294.9%       2372 ± 26%  numa-vmstat.node0.nr_writeback
   1114505 ±  0%     -77.4%     251696 ±  2%  softirqs.TASKLET
   1808027 ±  1%     -77.7%     402378 ±  4%  softirqs.RCU
  12158665 ±  1%     -77.1%    2786069 ±  4%  cpuidle.C3-IVT.usage
   1119433 ±  0%     -77.3%     254192 ±  2%  softirqs.BLOCK
  37824202 ±  1%     -75.1%    9405078 ±  4%  cpuidle.C6-IVT.usage
 1.015e+08 ±  1%     -73.6%   26767355 ±  3%  time.voluntary_context_switches
     13793 ±  1%     -73.9%       3603 ±  5%  time.system_time
   5971084 ±  1%     -73.6%    1574912 ±  5%  softirqs.SCHED
  10539492 ±  3%     -72.0%    2956258 ±  6%  cpuidle.C1E-IVT.usage
         2 ±  0%    +230.0%          6 ± 12%  vmstat.procs.b
     14064 ±  1%     -71.2%       4049 ±  6%  softirqs.HRTIMER
   7388306 ±  1%     -71.2%    2129929 ±  4%  softirqs.TIMER
 3.496e+09 ±  1%     -70.3%   1.04e+09 ±  1%  cpuidle.C3-IVT.time
      0.88 ±  6%    +224.9%       2.87 ± 11%  turbostat.Pkg%pc6
  19969464 ±  2%     -66.2%    6750675 ±  5%  cpuidle.C1-IVT.usage
     78473 ±  6%     -64.3%      28016 ±  7%  time.involuntary_context_switches
      4.23 ±  5%    +181.4%      11.90 ±  3%  turbostat.Pkg%pc2
 2.551e+09 ±  1%     -61.4%  9.837e+08 ±  3%  cpuidle.C1E-IVT.time
      8084 ±  3%    +142.6%      19608 ±  3%  meminfo.Writeback
      2026 ±  4%    +141.6%       4895 ±  4%  proc-vmstat.nr_writeback
       165 ±  4%     -56.9%         71 ± 14%  numa-vmstat.node1.nr_inactive_anon
 7.748e+09 ±  3%     -50.3%  3.852e+09 ±  3%  cpuidle.C1-IVT.time
       175 ±  5%     -53.2%         82 ± 13%  numa-vmstat.node1.nr_shmem
      1115 ±  0%     -50.3%        554 ±  1%  time.elapsed_time.max
      1115 ±  0%     -50.3%        554 ±  1%  time.elapsed_time
      1147 ±  0%     -49.0%        585 ±  1%  uptime.boot
   2260889 ±  0%     -48.8%    1157272 ±  1%  proc-vmstat.pgfree
     16805 ±  2%     -35.9%      10776 ± 23%  numa-vmstat.node1.nr_dirty
      1235 ±  2%     -47.5%        649 ±  3%  time.percent_of_cpu_this_job_got
     67245 ±  2%     -35.9%      43122 ± 23%  numa-meminfo.node1.Dirty
     39041 ±  0%     -45.7%      21212 ±  2%  uptime.idle
        13 ±  9%     -49.0%          6 ± 11%  vmstat.procs.r
      3072 ± 10%     -40.3%       1833 ±  9%  cpuidle.POLL.usage
   3045115 ±  0%     -46.1%    1642053 ±  1%  proc-vmstat.pgfault
       202 ±  1%     -45.2%        110 ±  0%  proc-vmstat.nr_inactive_anon
   4583079 ±  2%     -31.4%    3143602 ± 16%  numa-vmstat.node1.numa_hit
     28.03 ±  0%     +69.1%      47.39 ±  1%  turbostat.CPU%c6
       223 ±  1%     -41.1%        131 ±  1%  proc-vmstat.nr_shmem
   4518820 ±  3%     -30.8%    3128304 ± 16%  numa-vmstat.node1.numa_local
   3363496 ±  3%     -27.4%    2441619 ± 20%  numa-vmstat.node1.nr_dirtied
   3345346 ±  3%     -27.4%    2428396 ± 20%  numa-vmstat.node1.nr_written
      0.18 ± 18%    +105.6%       0.37 ± 36%  turbostat.Pkg%pc3
   3427913 ±  3%     -27.3%    2492563 ± 20%  numa-vmstat.node1.nr_inactive_file
  13712431 ±  3%     -27.3%    9971152 ± 20%  numa-meminfo.node1.Inactive
  13711768 ±  3%     -27.3%    9970866 ± 20%  numa-meminfo.node1.Inactive(file)
   3444598 ±  3%     -27.2%    2508920 ± 20%  numa-vmstat.node1.nr_file_pages
  13778510 ±  3%     -27.2%   10036287 ± 20%  numa-meminfo.node1.FilePages
   8819175 ±  1%     -28.3%    6320188 ± 19%  numa-numastat.node1.numa_hit
   8819051 ±  1%     -28.3%    6320152 ± 19%  numa-numastat.node1.local_node
  14350918 ±  3%     -26.8%   10504070 ± 19%  numa-meminfo.node1.MemUsed
    100892 ±  3%     -26.0%      74623 ± 19%  numa-vmstat.node1.nr_slab_reclaimable
    403571 ±  3%     -26.0%     298513 ± 19%  numa-meminfo.node1.SReclaimable
      3525 ± 13%     +36.6%       4817 ± 14%  slabinfo.blkdev_requests.active_objs
      3552 ± 13%     +36.3%       4841 ± 14%  slabinfo.blkdev_requests.num_objs
     30779 ±  4%     -34.7%      20084 ± 12%  proc-vmstat.pgmigrate_success
     30779 ±  4%     -34.7%      20084 ± 12%  proc-vmstat.numa_pages_migrated
    447400 ±  2%     -23.2%     343701 ± 16%  numa-meminfo.node1.Slab
 2.532e+10 ±  0%     -33.1%  1.694e+10 ±  1%  cpuidle.C6-IVT.time
      3081 ±  9%     +28.0%       3945 ± 12%  slabinfo.mnt_cache.num_objs
      3026 ±  9%     +28.8%       3898 ± 12%  slabinfo.mnt_cache.active_objs
      5822 ±  4%     +77.8%      10350 ± 25%  numa-meminfo.node1.Writeback
      1454 ±  4%     +77.3%       2579 ± 25%  numa-vmstat.node1.nr_writeback
    424984 ±  1%     -26.5%     312255 ±  3%  proc-vmstat.numa_pte_updates
    368001 ±  1%     -26.8%     269440 ±  3%  proc-vmstat.numa_hint_faults
    456465 ±  1%     -26.7%     334594 ±  4%  time.minor_page_faults
      3.86 ±  3%     -24.4%       2.92 ±  2%  turbostat.CPU%c3
   4661151 ±  2%     +20.6%    5622999 ±  9%  numa-vmstat.node1.nr_free_pages
  18644452 ±  2%     +20.6%   22491300 ±  9%  numa-meminfo.node1.MemFree
       876 ±  2%     +28.2%       1124 ±  5%  slabinfo.kmalloc-4096.num_objs
       858 ±  3%     +24.0%       1064 ±  5%  slabinfo.kmalloc-4096.active_objs
  17767832 ±  8%     -25.4%   13249545 ± 17%  cpuidle.POLL.time
    285093 ±  1%     -23.1%     219372 ±  5%  proc-vmstat.numa_hint_faults_local
    105423 ±  2%     -16.1%      88498 ±  0%  meminfo.Dirty
     26365 ±  1%     -16.0%      22152 ±  1%  proc-vmstat.nr_dirty
     41.04 ±  1%     -14.1%      35.26 ±  1%  turbostat.CPU%c1
      9385 ±  4%     -14.3%       8043 ±  6%  slabinfo.kmalloc-192.active_objs
      9574 ±  3%     -13.9%       8241 ±  6%  slabinfo.kmalloc-192.num_objs
      2411 ±  3%     +17.0%       2820 ±  4%  slabinfo.kmalloc-2048.active_objs
  12595574 ±  0%     -10.0%   11338368 ±  1%  proc-vmstat.pgalloc_normal
      5262 ±  1%     +13.3%       5962 ±  1%  slabinfo.kmalloc-1024.num_objs
      5262 ±  1%     +12.7%       5932 ±  1%  slabinfo.kmalloc-1024.active_objs
      2538 ±  3%     +13.7%       2885 ±  4%  slabinfo.kmalloc-2048.num_objs
   5299546 ±  0%      -9.9%    4776351 ±  0%  slabinfo.buffer_head.active_objs
   5299546 ±  0%      -9.9%    4776351 ±  0%  slabinfo.buffer_head.num_objs
    135885 ±  0%      -9.9%     122470 ±  0%  slabinfo.buffer_head.num_slabs
    135885 ±  0%      -9.9%     122470 ±  0%  slabinfo.buffer_head.active_slabs
     28.04 ±  2%    +715.6%     228.69 ±  3%  iostat.sdb.avgrq-sz
     28.05 ±  2%    +708.1%     226.72 ±  2%  iostat.sdc.avgrq-sz
      2245 ±  3%     -81.6%        413 ±  1%  iostat.sda.w/s
      5.33 ±  1%   +1008.2%      59.07 ±  1%  iostat.sda.w_await
      5.85 ±  1%   +1126.4%      71.69 ±  4%  iostat.sda.r_await
      5.36 ±  1%    +978.6%      57.79 ±  3%  iostat.sdc.w_await
      1263 ±  4%     -85.8%        179 ±  6%  iostat.sdc.r/s
      2257 ±  3%     -81.6%        414 ±  2%  iostat.sdb.w/s
      1264 ±  4%     -85.8%        179 ±  6%  iostat.sdb.r/s
      5.55 ±  0%   +1024.2%      62.37 ±  4%  iostat.sdb.await
      5.89 ±  1%   +1125.9%      72.16 ±  6%  iostat.sdb.r_await
      5.36 ±  0%   +1014.3%      59.75 ±  3%  iostat.sdb.w_await
      5.57 ±  1%    +987.9%      60.55 ±  3%  iostat.sdc.await
      5.51 ±  0%   +1017.3%      61.58 ±  1%  iostat.sda.await
      1264 ±  4%     -85.8%        179 ±  6%  iostat.sda.r/s
     28.09 ±  2%    +714.2%     228.73 ±  2%  iostat.sda.avgrq-sz
      5.95 ±  2%   +1091.0%      70.82 ±  6%  iostat.sdc.r_await
      2252 ±  3%     -81.5%        417 ±  2%  iostat.sdc.w/s
      4032 ±  2%    +151.6%      10143 ±  1%  iostat.sdb.wrqm/s
      4043 ±  2%    +151.0%      10150 ±  1%  iostat.sda.wrqm/s
      4035 ±  2%    +151.2%      10138 ±  1%  iostat.sdc.wrqm/s
     26252 ±  1%     -54.0%      12077 ±  4%  vmstat.system.in
     37813 ±  0%    +101.0%      75998 ±  1%  vmstat.io.bo
     37789 ±  0%    +101.0%      75945 ±  1%  iostat.md0.wkB/s
       205 ±  0%     +96.1%        402 ±  1%  iostat.md0.w/s
    164286 ±  1%     -46.2%      88345 ±  2%  vmstat.system.cs
     27.07 ±  2%     -46.7%      14.42 ±  3%  turbostat.%Busy
       810 ±  2%     -46.7%        431 ±  3%  turbostat.Avg_MHz
     15.56 ±  2%     +71.7%      26.71 ±  1%  iostat.sda.avgqu-sz
     15.65 ±  2%     +69.1%      26.46 ±  2%  iostat.sdc.avgqu-sz
     15.67 ±  2%     +72.7%      27.06 ±  2%  iostat.sdb.avgqu-sz
     25151 ±  0%     +68.3%      42328 ±  1%  iostat.sda.wkB/s
     25153 ±  0%     +68.2%      42305 ±  1%  iostat.sdb.wkB/s
     25149 ±  0%     +68.2%      42292 ±  1%  iostat.sdc.wkB/s
     97.45 ±  0%     -21.1%      76.90 ±  0%  turbostat.CorWatt
     12517 ±  0%     -20.2%       9994 ±  1%  iostat.sdc.rkB/s
     12517 ±  0%     -20.0%      10007 ±  1%  iostat.sda.rkB/s
     12512 ±  0%     -19.9%      10018 ±  1%  iostat.sdb.rkB/s
      1863 ±  3%     +24.7%       2325 ±  1%  iostat.sdb.rrqm/s
      1865 ±  3%     +24.3%       2319 ±  1%  iostat.sdc.rrqm/s
      1864 ±  3%     +24.6%       2322 ±  1%  iostat.sda.rrqm/s
       128 ±  0%     -16.4%        107 ±  0%  turbostat.PkgWatt
    150569 ±  0%      -8.7%     137525 ±  0%  iostat.md0.avgqu-sz
      4.29 ±  0%      -5.1%       4.07 ±  0%  turbostat.RAMWatt
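
The %change column in these tables is simply the relative difference between the two metric values; for example, the fsmark.files_per_sec row at the top of this breakdown:

```shell
# 6.40 files/sec (parent commit 26089f49) -> 19.20 files/sec (4400755e)
awk 'BEGIN { o = 6.40; n = 19.20; printf "%+.1f%%\n", (n - o) / o * 100 }'
```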


more detailed changes about ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
---------

26089f4902595a2f  4400755e356f9a2b0b7ceaa02f  
----------------  --------------------------  
         %stddev     %change         %stddev
             \          |                \  
       273 ±  4%     -18.1%        223 ±  6%  fsmark.files_per_sec
     29.24 ±  1%     +27.2%      37.20 ±  8%  fsmark.time.elapsed_time.max
     29.24 ±  1%     +27.2%      37.20 ±  8%  fsmark.time.elapsed_time
       399 ±  4%     -20.0%        319 ±  3%  fsmark.time.percent_of_cpu_this_job_got
    129891 ± 20%     -28.9%      92334 ± 15%  fsmark.time.voluntary_context_switches
       266 ±  0%    +413.4%       1365 ±  5%  slabinfo.raid5-md0.num_objs
       266 ±  0%    +413.4%       1365 ±  5%  slabinfo.raid5-md0.active_objs
      0.23 ± 27%     +98.6%       0.46 ± 35%  turbostat.CPU%c3
  56612063 ±  9%     +36.7%   77369763 ± 20%  cpuidle.C1-IVT.time
   5579498 ± 14%     -36.0%    3571516 ±  6%  cpuidle.C1E-IVT.time
      4668 ± 38%     +64.7%       7690 ± 19%  numa-vmstat.node0.nr_unevictable
     18674 ± 38%     +64.7%      30762 ± 19%  numa-meminfo.node0.Unevictable
      9298 ± 37%     +64.4%      15286 ± 19%  proc-vmstat.nr_unevictable
      4629 ± 37%     +64.1%       7596 ± 19%  numa-vmstat.node1.nr_unevictable
     18535 ± 37%     +63.9%      30385 ± 19%  numa-meminfo.node1.Unevictable
   4270894 ± 19%     +65.6%    7070923 ± 21%  cpuidle.C3-IVT.time
     38457 ± 37%     +59.0%      61148 ± 19%  meminfo.Unevictable
   3748226 ± 17%     +26.6%    4743674 ± 16%  numa-vmstat.node0.numa_local
   4495283 ± 13%     -24.8%    3382315 ± 17%  numa-vmstat.node0.nr_free_pages
   3818432 ± 16%     +26.5%    4830938 ± 16%  numa-vmstat.node0.numa_hit
  17966826 ± 13%     -24.7%   13537228 ± 17%  numa-meminfo.node0.MemFree
  14901309 ± 15%     +29.7%   19330906 ± 12%  numa-meminfo.node0.MemUsed
        26 ± 21%     -32.9%         17 ± 14%  cpuidle.POLL.usage
 1.183e+09 ±  1%     +29.6%  1.533e+09 ±  8%  cpuidle.C6-IVT.time
     29.24 ±  1%     +27.2%      37.20 ±  8%  time.elapsed_time
     29.24 ±  1%     +27.2%      37.20 ±  8%  time.elapsed_time.max
       399 ±  4%     -20.0%        319 ±  3%  time.percent_of_cpu_this_job_got
       850 ±  4%      -8.6%        777 ±  5%  slabinfo.blkdev_requests.num_objs
       850 ±  4%      -8.6%        777 ±  5%  slabinfo.blkdev_requests.active_objs
     14986 ±  9%     +17.1%      17548 ±  8%  numa-vmstat.node0.nr_slab_reclaimable
     11943 ±  5%     -12.6%      10441 ±  2%  slabinfo.kmalloc-192.num_objs
     59986 ±  9%     +17.0%      70186 ±  8%  numa-meminfo.node0.SReclaimable
      3703 ±  6%     +10.2%       4082 ±  7%  slabinfo.btrfs_delayed_data_ref.num_objs
    133551 ±  6%     +16.1%     154995 ±  1%  proc-vmstat.pgfault
    129891 ± 20%     -28.9%      92334 ± 15%  time.voluntary_context_switches
     11823 ±  4%     -12.0%      10409 ±  3%  slabinfo.kmalloc-192.active_objs
      3703 ±  6%      +9.7%       4061 ±  7%  slabinfo.btrfs_delayed_data_ref.active_objs
     19761 ±  2%     -11.2%      17542 ±  6%  slabinfo.anon_vma.active_objs
     19761 ±  2%     -11.2%      17544 ±  6%  slabinfo.anon_vma.num_objs
     13002 ±  3%     +14.9%      14944 ±  5%  slabinfo.kmalloc-256.num_objs
     12695 ±  3%     +13.8%      14446 ±  7%  slabinfo.kmalloc-256.active_objs
      1190 ±  1%     -11.8%       1050 ±  3%  slabinfo.mnt_cache.num_objs
      1190 ±  1%     -11.8%       1050 ±  3%  slabinfo.mnt_cache.active_objs
    136862 ±  1%     -13.8%     117938 ±  7%  cpuidle.C6-IVT.usage
   1692630 ±  3%     +12.3%    1900854 ±  0%  numa-vmstat.node0.nr_written
      1056 ±  2%      +8.8%       1149 ±  3%  slabinfo.mm_struct.active_objs
      1056 ±  2%      +8.8%       1149 ±  3%  slabinfo.mm_struct.num_objs
     24029 ± 11%     -30.6%      16673 ±  8%  vmstat.system.cs
      8859 ±  2%     -15.0%       7530 ±  8%  vmstat.system.in
    905630 ±  2%     -16.8%     753097 ±  4%  iostat.md0.wkB/s
    906433 ±  2%     -16.9%     753482 ±  4%  vmstat.io.bo
      3591 ±  2%     -16.9%       2982 ±  4%  iostat.md0.w/s
     13.22 ±  5%     -16.3%      11.07 ±  1%  turbostat.%Busy
       402 ±  4%     -15.9%        338 ±  1%  turbostat.Avg_MHz
     54236 ±  3%     +10.4%      59889 ±  4%  iostat.md0.avgqu-sz
      7.67 ±  1%      +4.5%       8.01 ±  1%  turbostat.RAMWatt



	--yliu


* Re: performance changes on 4400755e:  200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and few more
  2015-03-18  5:00 performance changes on 4400755e: 200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and few more Yuanahn Liu
@ 2015-03-25  3:03 ` NeilBrown
  2015-03-26  4:30   ` Yuanhan Liu
  0 siblings, 1 reply; 3+ messages in thread
From: NeilBrown @ 2015-03-25  3:03 UTC (permalink / raw)
  To: Yuanahn Liu; +Cc: lkp, lkp, LKML, Chris Mason


On Wed, 18 Mar 2015 13:00:30 +0800 Yuanahn Liu <yuanhan.liu@linux.intel.com>
wrote:

> Hi,
> 
> FYI, we noticed performance changes on `fsmark.files_per_sec' by 4400755e356f9a2b0b7ceaa02f57b1c7546c3765:
> 
>     > commit 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
>     > Author:     NeilBrown <neilb@suse.de>
>     > AuthorDate: Thu Feb 26 12:47:56 2015 +1100
>     > Commit:     NeilBrown <neilb@suse.de>
>     > CommitDate: Wed Mar 4 13:40:19 2015 +1100
>     > 
>     >     md/raid5: allow the stripe_cache to grow and shrink.

Thanks a lot for this testing!!! I was wondering how I could do some proper
testing of this patch, and you've done it for me :-)

The large number of improvements is very encouraging - that is what I was
hoping for of course.

The few regressions could be a concern.  I note that they are all NoSync.
That seems to suggest that they could just be writing more data,
i.e. the data is written a bit earlier (certainly possible) and so happens to
introduce more delay ....

I guess I'm not really sure how to interpret NoSync results, and suspect that
poor NoSync results don't really reflect much on the underlying block device.
Could that be right?

Also, I'm a little confused by the
   fsmark.time.percent_of_cpu_this_job_got
statistic:

>       1235 ±  2%     -47.5%        649 ±  3%  fsmark.time.percent_of_cpu_this_job_got

>        399 ±  4%     -20.0%        319 ±  3%  fsmark.time.percent_of_cpu_this_job_got

Does that mean that the ext4 test changed from 12.4 CPUs to 6.4, and that
the btrfs test changed from 4 CPUs to 3.2 ???

Or does it just not mean anything?
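
For reference, the "percent of CPU" figure reported by time(1) is cumulative across CPUs, so dividing by 100 gives the implied average CPU count (a quick sketch using the four values quoted above):

```shell
# percent_of_cpu_this_job_got is summed over all CPUs, so /100 gives the
# average number of CPUs the job kept busy.
for pct in 1235 649 399 319; do
    awk -v p="$pct" 'BEGIN { printf "%4d%% -> %.2f CPUs\n", p, p / 100 }'
done
```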

Thanks,
NeilBrown




>      15.67 ±  2%     +72.7%      27.06 ±  2%  iostat.sdb.avgqu-sz
>      25151 ±  0%     +68.3%      42328 ±  1%  iostat.sda.wkB/s
>      25153 ±  0%     +68.2%      42305 ±  1%  iostat.sdb.wkB/s
>      25149 ±  0%     +68.2%      42292 ±  1%  iostat.sdc.wkB/s
>      97.45 ±  0%     -21.1%      76.90 ±  0%  turbostat.CorWatt
>      12517 ±  0%     -20.2%       9994 ±  1%  iostat.sdc.rkB/s
>      12517 ±  0%     -20.0%      10007 ±  1%  iostat.sda.rkB/s
>      12512 ±  0%     -19.9%      10018 ±  1%  iostat.sdb.rkB/s
>       1863 ±  3%     +24.7%       2325 ±  1%  iostat.sdb.rrqm/s
>       1865 ±  3%     +24.3%       2319 ±  1%  iostat.sdc.rrqm/s
>       1864 ±  3%     +24.6%       2322 ±  1%  iostat.sda.rrqm/s
>        128 ±  0%     -16.4%        107 ±  0%  turbostat.PkgWatt
>     150569 ±  0%      -8.7%     137525 ±  0%  iostat.md0.avgqu-sz
>       4.29 ±  0%      -5.1%       4.07 ±  0%  turbostat.RAMWatt
> 
> 
> more detailed changes about ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> ---------
> 
> 26089f4902595a2f  4400755e356f9a2b0b7ceaa02f  
> ----------------  --------------------------  
>          %stddev     %change         %stddev
>              \          |                \  
>        273 ±  4%     -18.1%        223 ±  6%  fsmark.files_per_sec
>      29.24 ±  1%     +27.2%      37.20 ±  8%  fsmark.time.elapsed_time.max
>      29.24 ±  1%     +27.2%      37.20 ±  8%  fsmark.time.elapsed_time
>        399 ±  4%     -20.0%        319 ±  3%  fsmark.time.percent_of_cpu_this_job_got
>     129891 ± 20%     -28.9%      92334 ± 15%  fsmark.time.voluntary_context_switches
>        266 ±  0%    +413.4%       1365 ±  5%  slabinfo.raid5-md0.num_objs
>        266 ±  0%    +413.4%       1365 ±  5%  slabinfo.raid5-md0.active_objs
>       0.23 ± 27%     +98.6%       0.46 ± 35%  turbostat.CPU%c3
>   56612063 ±  9%     +36.7%   77369763 ± 20%  cpuidle.C1-IVT.time
>    5579498 ± 14%     -36.0%    3571516 ±  6%  cpuidle.C1E-IVT.time
>       4668 ± 38%     +64.7%       7690 ± 19%  numa-vmstat.node0.nr_unevictable
>      18674 ± 38%     +64.7%      30762 ± 19%  numa-meminfo.node0.Unevictable
>       9298 ± 37%     +64.4%      15286 ± 19%  proc-vmstat.nr_unevictable
>       4629 ± 37%     +64.1%       7596 ± 19%  numa-vmstat.node1.nr_unevictable
>      18535 ± 37%     +63.9%      30385 ± 19%  numa-meminfo.node1.Unevictable
>    4270894 ± 19%     +65.6%    7070923 ± 21%  cpuidle.C3-IVT.time
>      38457 ± 37%     +59.0%      61148 ± 19%  meminfo.Unevictable
>    3748226 ± 17%     +26.6%    4743674 ± 16%  numa-vmstat.node0.numa_local
>    4495283 ± 13%     -24.8%    3382315 ± 17%  numa-vmstat.node0.nr_free_pages
>    3818432 ± 16%     +26.5%    4830938 ± 16%  numa-vmstat.node0.numa_hit
>   17966826 ± 13%     -24.7%   13537228 ± 17%  numa-meminfo.node0.MemFree
>   14901309 ± 15%     +29.7%   19330906 ± 12%  numa-meminfo.node0.MemUsed
>         26 ± 21%     -32.9%         17 ± 14%  cpuidle.POLL.usage
>  1.183e+09 ±  1%     +29.6%  1.533e+09 ±  8%  cpuidle.C6-IVT.time
>      29.24 ±  1%     +27.2%      37.20 ±  8%  time.elapsed_time
>      29.24 ±  1%     +27.2%      37.20 ±  8%  time.elapsed_time.max
>        399 ±  4%     -20.0%        319 ±  3%  time.percent_of_cpu_this_job_got
>        850 ±  4%      -8.6%        777 ±  5%  slabinfo.blkdev_requests.num_objs
>        850 ±  4%      -8.6%        777 ±  5%  slabinfo.blkdev_requests.active_objs
>      14986 ±  9%     +17.1%      17548 ±  8%  numa-vmstat.node0.nr_slab_reclaimable
>      11943 ±  5%     -12.6%      10441 ±  2%  slabinfo.kmalloc-192.num_objs
>      59986 ±  9%     +17.0%      70186 ±  8%  numa-meminfo.node0.SReclaimable
>       3703 ±  6%     +10.2%       4082 ±  7%  slabinfo.btrfs_delayed_data_ref.num_objs
>     133551 ±  6%     +16.1%     154995 ±  1%  proc-vmstat.pgfault
>     129891 ± 20%     -28.9%      92334 ± 15%  time.voluntary_context_switches
>      11823 ±  4%     -12.0%      10409 ±  3%  slabinfo.kmalloc-192.active_objs
>       3703 ±  6%      +9.7%       4061 ±  7%  slabinfo.btrfs_delayed_data_ref.active_objs
>      19761 ±  2%     -11.2%      17542 ±  6%  slabinfo.anon_vma.active_objs
>      19761 ±  2%     -11.2%      17544 ±  6%  slabinfo.anon_vma.num_objs
>      13002 ±  3%     +14.9%      14944 ±  5%  slabinfo.kmalloc-256.num_objs
>      12695 ±  3%     +13.8%      14446 ±  7%  slabinfo.kmalloc-256.active_objs
>       1190 ±  1%     -11.8%       1050 ±  3%  slabinfo.mnt_cache.num_objs
>       1190 ±  1%     -11.8%       1050 ±  3%  slabinfo.mnt_cache.active_objs
>     136862 ±  1%     -13.8%     117938 ±  7%  cpuidle.C6-IVT.usage
>    1692630 ±  3%     +12.3%    1900854 ±  0%  numa-vmstat.node0.nr_written
>       1056 ±  2%      +8.8%       1149 ±  3%  slabinfo.mm_struct.active_objs
>       1056 ±  2%      +8.8%       1149 ±  3%  slabinfo.mm_struct.num_objs
>      24029 ± 11%     -30.6%      16673 ±  8%  vmstat.system.cs
>       8859 ±  2%     -15.0%       7530 ±  8%  vmstat.system.in
>     905630 ±  2%     -16.8%     753097 ±  4%  iostat.md0.wkB/s
>     906433 ±  2%     -16.9%     753482 ±  4%  vmstat.io.bo
>       3591 ±  2%     -16.9%       2982 ±  4%  iostat.md0.w/s
>      13.22 ±  5%     -16.3%      11.07 ±  1%  turbostat.%Busy
>        402 ±  4%     -15.9%        338 ±  1%  turbostat.Avg_MHz
>      54236 ±  3%     +10.4%      59889 ±  4%  iostat.md0.avgqu-sz
>       7.67 ±  1%      +4.5%       8.01 ±  1%  turbostat.RAMWatt
> 
> 
> 
> 	--yliu
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]


* Re: performance changes on 4400755e:  200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and few more
  2015-03-25  3:03 ` NeilBrown
@ 2015-03-26  4:30   ` Yuanhan Liu
  0 siblings, 0 replies; 3+ messages in thread
From: Yuanhan Liu @ 2015-03-26  4:30 UTC (permalink / raw)
  To: NeilBrown; +Cc: lkp, LKML, Chris Mason, Yuanhan Liu

On Wed, Mar 25, 2015 at 02:03:59PM +1100, NeilBrown wrote:
> On Wed, 18 Mar 2015 13:00:30 +0800 Yuanahn Liu <yuanhan.liu@linux.intel.com>
> wrote:
> 
> > Hi,
> > 
> > FYI, we noticed performance changes on `fsmark.files_per_sec' by 4400755e356f9a2b0b7ceaa02f57b1c7546c3765:
> > 
> >     > commit 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> >     > Author:     NeilBrown <neilb@suse.de>
> >     > AuthorDate: Thu Feb 26 12:47:56 2015 +1100
> >     > Commit:     NeilBrown <neilb@suse.de>
> >     > CommitDate: Wed Mar 4 13:40:19 2015 +1100
> >     > 
> >     >     md/raid5: allow the stripe_cache to grow and shrink.
> 
> Thanks a lot for this testing!!! I was wondering how I could do some proper
> testing of this patch, and you've done it for me :-)

Welcome!

> 
> The large number of improvements is very encouraging - that is what I was
> hoping for of course.
> 
> The few regressions could be a concern.  I note they are all NoSync.
> That seems to suggest that they could just be writing more data.

It's not a time-based test, but a size-based one:

> >       40G, 30G, 120G: means the total test size

Hence, I doubt it's writing more data.
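FWIW, a quick back-of-the-envelope check (a sketch only, using the 120G
total size and 4M file size from the sub-testcase names above):

```python
# A size-based fsmark run writes a fixed amount of data no matter how
# long it takes, so both commits should write the same total
# (ignoring filesystem metadata overhead).
total_size_mb = 120 * 1024   # the "120G" total test size
file_size_mb = 4             # the "4M" single file size
files_total = total_size_mb // file_size_mb
print(files_total)           # 30720 files either way
```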


> i.e. the data is written a bit earlier (certainly possible) so it happens to
> introduce more delay ....
> 
> I guess I'm not really sure how to interpret NoSync results, and suspect that
> poor NoSync results don't really reflect much on the underlying block device.
> Could that be right?

Sorry, I'm not quite sure I follow you. By poor NoSync results, do you
mean the small numbers like 63.133 and 57.600? Their unit is
files_per_sec, and the file size is 4M, so that works out to 200+ MB/s,
which is not that bad in this case, as it's a 3-hard-disk RAID5.
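In case it helps, the conversion I have in mind, sketched out (file size
4M as in these sub-testcases):

```python
# Convert fsmark files_per_sec into approximate MB/s, given the 4M
# file size used in these NoSync sub-testcases.
file_size_mb = 4
for files_per_sec in (63.133, 55.633):
    mb_per_sec = files_per_sec * file_size_mb
    print(round(mb_per_sec, 1))   # 252.5, then 222.5
```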

> > 3   8.1               63.133     ±0.5%       3   9.2               55.633     ±0.2%         -11.9% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync

Here are a few iostat samples from 26089f4902595a2f64c512066af07af6e82eb096,
taken during the above test:

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               0.00    0.00    0.63    1.67    0.00   97.70
    
    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sdb               0.00 30353.00    0.00  240.00     0.00 121860.00  1015.50     1.29    5.35    0.00    5.35   3.50  83.90
    sdc               0.00 30353.00    0.00  241.00     0.00 122372.00  1015.54     0.66    2.74    0.00    2.74   2.53  60.90
    sda               0.00 30353.00    0.00  242.00     0.00 122884.00  1015.57     1.29    5.36    0.00    5.36   3.52  85.20
    md0               0.00     0.00    0.00  956.00     0.00 244736.00   512.00 227231.39    0.00    0.00    0.00   1.05 100.00
    
    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               0.02    0.00    0.69    1.69    0.00   97.60
    
    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sdb               0.00 30988.00    0.00  247.00     0.00 125444.00  1015.74     1.77    7.17    0.00    7.17   4.02  99.40
    sdc               0.00 30988.00    0.00  245.00     0.00 124420.00  1015.67     1.19    4.82    0.00    4.82   3.67  89.90
    sda               0.00 30988.00    0.00  247.00     0.00 125444.00  1015.74     0.65    2.65    0.00    2.65   2.54  62.70
    md0               0.00     0.00    0.00  976.00     0.00 249856.00   512.00 228206.37    0.00    0.00    0.00   1.02 100.00
    
    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               0.00    0.00    0.61    1.67    0.00   97.72
    
    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sdb               0.00 29718.00    0.00  235.00     0.00 119300.00  1015.32     1.35    5.71    0.00    5.71   3.71  87.20
    sdc               0.00 29718.00    0.00  236.00     0.00 119812.00  1015.36     1.19    5.06    0.00    5.06   3.43  80.90
    sda               0.00 29718.00    0.00  235.00     0.00 119300.00  1015.32     0.87    3.69    0.00    3.69   2.99  70.20
    md0               0.00     0.00    0.00  936.00     0.00 239616.00   512.00 229157.33    0.00    0.00    0.00   1.07 100.00


And here are a few iostat samples from 4400755e356f9a2b0b7ceaa02f57b1c7546c3765 (the first bad commit):

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               0.02    0.00    1.09    1.54    0.00   97.35
    
    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sdb               1.00 27677.00    1.00  206.00     8.00 100516.00   971.25    27.40  130.56  196.00  130.24   4.72  97.70
    sdc               0.00 27677.00    0.00  207.00     0.00 101028.00   976.12    27.05  129.43    0.00  129.43   4.61  95.50
    sda               5.00 27677.00    1.00  211.00    16.00 102984.00   971.70    26.61  127.00  201.00  126.64   4.50  95.50
    md0               0.00     0.00    0.00  824.00     0.00 210944.00   512.00 224122.02    0.00    0.00    0.00   1.21 100.00
    
    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               0.00    0.00    0.98    1.54    0.00   97.47
    
    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sdb               3.00 21203.00    1.00  218.00    16.00 107060.00   977.86    30.44  147.77  198.00  147.54   4.53  99.10
    sdc               2.00 21203.00    2.00  220.00    16.00 108592.00   978.45    31.12  150.65  208.00  150.13   4.43  98.40
    sda               0.00 21203.00    1.00  220.00    24.00 108020.00   977.77    30.56  150.88  197.00  150.67   4.38  96.80
    md0               0.00     0.00    0.00  720.00     0.00 184320.00   512.00 224963.92    0.00    0.00    0.00   1.39 100.00
    
    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               0.02    0.00    0.96    1.63    0.00   97.39
    
    Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
    sdb              11.00 29455.00    3.00  213.00    56.00 102958.00   953.83    31.19  134.97  205.00  133.99   4.56  98.40
    sdc               0.00 29454.00    0.00  210.00     0.00 99890.00   951.33    29.36  127.07    0.00  127.07   4.36  91.60
    sda               1.00 29454.00    0.00  215.00     0.00 103534.00   963.11    27.54  117.54    0.00  117.54   4.26  91.60
    md0               0.00     0.00    0.00  876.00     0.00 224256.00   512.00 225993.60    0.00    0.00    0.00   1.14 100.10



> > 3   0.5              273.100     ±4.3%       3   0.6              223.567     ±6.5%         -18.1% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> > 3   8.1               63.133     ±0.5%       3   9.2               55.633     ±0.2%         -11.9% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync
> > 3   8.2               64.000     ±0.0%       3   9.2               57.600     ±0.0%         -10.0% ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync
> 
> Also, I'm a little confused by the
>    fsmark.time.involuntary_context_switches
> statistic:
> 
> >       1235 ±  2%     -47.5%        649 ±  3%  fsmark.time.percent_of_cpu_this_job_got
> 
> >        399 ±  4%     -20.0%        319 ±  3%  fsmark.time.percent_of_cpu_this_job_got
> 
> Does that mean that the ext4 test changed from 12.4 cpus to 6.4, and that
> the btrfs test changed from 4 cpus to 3.2 ???

fsmark.time.percent_of_cpu_this_job_got is output from /usr/bin/time, which
comes from the GNU time package. Here is the explanation from its source code:

    *  P == percent of CPU this job got (total cpu time / elapsed time)
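So, using the ext4 numbers from the detailed table (system_time 13793s,
elapsed 1115s), the reported 1235% can be roughly reproduced; a quick
sketch (user time is negligible here, so system_time alone gets close):

```python
# Rough reproduction of /usr/bin/time's %CPU for the ext4 run on the
# good commit: P = total cpu time / elapsed time.
system_time_s = 13793.0   # fsmark.time.system_time
elapsed_s = 1115.0        # fsmark.time.elapsed_time
print(round(100 * system_time_s / elapsed_s))  # ~1237%, vs the reported 1235%
```

In other words, yes: the job really did get roughly 12.4 CPUs' worth of
time before the patch and about 6.5 after.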


	--yliu

> 
> Or does it just not mean anything?
> 
> Thanks,
> NeilBrown
> 
> 
> 
> 
> > 
> > 26089f4902595a2f64c512066af07af6e82eb096     4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> > ----------------------------------------     ----------------------------------------
> > run time(m)     metric_value     ±stddev     run time(m)     metric_value     ±stddev     change   testbox/benchmark/sub-testcase
> > --- ------  ----------------------------     --- ------  ----------------------------     -------- ------------------------------
> > 3   18.6               6.400     ±0.0%       5   9.2               19.200     ±0.0%         200.0% ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
> > 3   24.7               6.400     ±0.0%       3   13.7              12.800     ±0.0%         100.0% ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
> > 3   17.5              28.267     ±9.6%       3   12.3              42.833     ±6.5%          51.5% ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-120G-NoSync
> > 3   16.7              30.700     ±1.5%       3   12.6              40.733     ±2.4%          32.7% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
> > 3   29.0               5.867     ±0.8%       5   23.6               7.240     ±0.7%          23.4% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
> > 3   28.5               6.000     ±0.0%       3   23.2               7.367     ±0.6%          22.8% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
> > 5   11.7              14.600     ±0.0%       5   9.7               17.500     ±0.4%          19.9% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
> > 3   22.4              25.600     ±0.0%       5   17.9              30.120     ±4.1%          17.7% ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-120G-NoSync
> > 5   10.8              47.320     ±0.6%       5   9.3               54.820     ±0.2%          15.8% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
> > 1   0.5              252.400     ±0.0%       1   0.5              263.300     ±0.0%           4.3% ivb44/fsmark/1x-1t-4BRD_12G-RAID5-ext4-4M-30G-NoSync
> > 
> > 3   0.5              273.100     ±4.3%       3   0.6              223.567     ±6.5%         -18.1% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> > 3   8.1               63.133     ±0.5%       3   9.2               55.633     ±0.2%         -11.9% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync
> > 3   8.2               64.000     ±0.0%       3   9.2               57.600     ±0.0%         -10.0% ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync
> > 
> > 
> > NOTE: here are some more info about those test parameters for you to
> >       understand the testcase better:
> > 
> >       1x: where 'x' means iterations or loop, corresponding to the 'L' option of fsmark
> >       1t, 64t: where 't' means thread
> >       4M: means the single file size, corresponding to the '-s' option of fsmark
> >       40G, 30G, 120G: means the total test size
> > 
> >       4BRD_12G: BRD is the ramdisk, where '4' means 4 ramdisk, and where '12G' means
> >                 the size of one ramdisk. So, it would be 48G in total. And we made a
> > 		raid on those ramdisk.
> > 
> > 
> > As you can see from above data, interestingly, all performance
> > regressions come from btrfs testing. That's why Chris is also
> > in the cc list, with which just FYI.
> > 
> > 
> > FYI, here I listed more detailed changes for the maximal postive and negtive changes.
> > 
> > more detailed changes about ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
> > ---------
> > 
> > 26089f4902595a2f  4400755e356f9a2b0b7ceaa02f  
> > ----------------  --------------------------  
> >          %stddev     %change         %stddev
> >              \          |                \  
> >       6.40 ±  0%    +200.0%      19.20 ±  0%  fsmark.files_per_sec
> >  1.015e+08 ±  1%     -73.6%   26767355 ±  3%  fsmark.time.voluntary_context_switches
> >      13793 ±  1%     -73.9%       3603 ±  5%  fsmark.time.system_time
> >      78473 ±  6%     -64.3%      28016 ±  7%  fsmark.time.involuntary_context_switches
> >   15789555 ±  9%     -54.7%    7159485 ± 13%  fsmark.app_overhead
> >       1115 ±  0%     -50.3%        554 ±  1%  fsmark.time.elapsed_time.max
> >       1115 ±  0%     -50.3%        554 ±  1%  fsmark.time.elapsed_time
> >       1235 ±  2%     -47.5%        649 ±  3%  fsmark.time.percent_of_cpu_this_job_got
> >     456465 ±  1%     -26.7%     334594 ±  4%  fsmark.time.minor_page_faults
> >        275 ±  0%   +1257.7%       3733 ±  2%  slabinfo.raid5-md0.num_objs
> >        275 ±  0%   +1257.7%       3733 ±  2%  slabinfo.raid5-md0.active_objs
> >         11 ±  0%   +1250.9%        148 ±  2%  slabinfo.raid5-md0.active_slabs
> >         11 ±  0%   +1250.9%        148 ±  2%  slabinfo.raid5-md0.num_slabs
> >       2407 ±  4%    +293.4%       9471 ± 26%  numa-meminfo.node0.Writeback
> >        600 ±  4%    +294.9%       2372 ± 26%  numa-vmstat.node0.nr_writeback
> >    1114505 ±  0%     -77.4%     251696 ±  2%  softirqs.TASKLET
> >    1808027 ±  1%     -77.7%     402378 ±  4%  softirqs.RCU
> >   12158665 ±  1%     -77.1%    2786069 ±  4%  cpuidle.C3-IVT.usage
> >    1119433 ±  0%     -77.3%     254192 ±  2%  softirqs.BLOCK
> >   37824202 ±  1%     -75.1%    9405078 ±  4%  cpuidle.C6-IVT.usage
> >  1.015e+08 ±  1%     -73.6%   26767355 ±  3%  time.voluntary_context_switches
> >      13793 ±  1%     -73.9%       3603 ±  5%  time.system_time
> >    5971084 ±  1%     -73.6%    1574912 ±  5%  softirqs.SCHED
> >   10539492 ±  3%     -72.0%    2956258 ±  6%  cpuidle.C1E-IVT.usage
> >          2 ±  0%    +230.0%          6 ± 12%  vmstat.procs.b
> >      14064 ±  1%     -71.2%       4049 ±  6%  softirqs.HRTIMER
> >    7388306 ±  1%     -71.2%    2129929 ±  4%  softirqs.TIMER
> >  3.496e+09 ±  1%     -70.3%   1.04e+09 ±  1%  cpuidle.C3-IVT.time
> >       0.88 ±  6%    +224.9%       2.87 ± 11%  turbostat.Pkg%pc6
> >   19969464 ±  2%     -66.2%    6750675 ±  5%  cpuidle.C1-IVT.usage
> >      78473 ±  6%     -64.3%      28016 ±  7%  time.involuntary_context_switches
> >       4.23 ±  5%    +181.4%      11.90 ±  3%  turbostat.Pkg%pc2
> >  2.551e+09 ±  1%     -61.4%  9.837e+08 ±  3%  cpuidle.C1E-IVT.time
> >       8084 ±  3%    +142.6%      19608 ±  3%  meminfo.Writeback
> >       2026 ±  4%    +141.6%       4895 ±  4%  proc-vmstat.nr_writeback
> >        165 ±  4%     -56.9%         71 ± 14%  numa-vmstat.node1.nr_inactive_anon
> >  7.748e+09 ±  3%     -50.3%  3.852e+09 ±  3%  cpuidle.C1-IVT.time
> >        175 ±  5%     -53.2%         82 ± 13%  numa-vmstat.node1.nr_shmem
> >       1115 ±  0%     -50.3%        554 ±  1%  time.elapsed_time.max
> >       1115 ±  0%     -50.3%        554 ±  1%  time.elapsed_time
> >       1147 ±  0%     -49.0%        585 ±  1%  uptime.boot
> >    2260889 ±  0%     -48.8%    1157272 ±  1%  proc-vmstat.pgfree
> >      16805 ±  2%     -35.9%      10776 ± 23%  numa-vmstat.node1.nr_dirty
> >       1235 ±  2%     -47.5%        649 ±  3%  time.percent_of_cpu_this_job_got
> >      67245 ±  2%     -35.9%      43122 ± 23%  numa-meminfo.node1.Dirty
> >      39041 ±  0%     -45.7%      21212 ±  2%  uptime.idle
> >         13 ±  9%     -49.0%          6 ± 11%  vmstat.procs.r
> >       3072 ± 10%     -40.3%       1833 ±  9%  cpuidle.POLL.usage
> >    3045115 ±  0%     -46.1%    1642053 ±  1%  proc-vmstat.pgfault
> >        202 ±  1%     -45.2%        110 ±  0%  proc-vmstat.nr_inactive_anon
> >    4583079 ±  2%     -31.4%    3143602 ± 16%  numa-vmstat.node1.numa_hit
> >      28.03 ±  0%     +69.1%      47.39 ±  1%  turbostat.CPU%c6
> >        223 ±  1%     -41.1%        131 ±  1%  proc-vmstat.nr_shmem
> >    4518820 ±  3%     -30.8%    3128304 ± 16%  numa-vmstat.node1.numa_local
> >    3363496 ±  3%     -27.4%    2441619 ± 20%  numa-vmstat.node1.nr_dirtied
> >    3345346 ±  3%     -27.4%    2428396 ± 20%  numa-vmstat.node1.nr_written
> >       0.18 ± 18%    +105.6%       0.37 ± 36%  turbostat.Pkg%pc3
> >    3427913 ±  3%     -27.3%    2492563 ± 20%  numa-vmstat.node1.nr_inactive_file
> >   13712431 ±  3%     -27.3%    9971152 ± 20%  numa-meminfo.node1.Inactive
> >   13711768 ±  3%     -27.3%    9970866 ± 20%  numa-meminfo.node1.Inactive(file)
> >    3444598 ±  3%     -27.2%    2508920 ± 20%  numa-vmstat.node1.nr_file_pages
> >   13778510 ±  3%     -27.2%   10036287 ± 20%  numa-meminfo.node1.FilePages
> >    8819175 ±  1%     -28.3%    6320188 ± 19%  numa-numastat.node1.numa_hit
> >    8819051 ±  1%     -28.3%    6320152 ± 19%  numa-numastat.node1.local_node
> >   14350918 ±  3%     -26.8%   10504070 ± 19%  numa-meminfo.node1.MemUsed
> >     100892 ±  3%     -26.0%      74623 ± 19%  numa-vmstat.node1.nr_slab_reclaimable
> >     403571 ±  3%     -26.0%     298513 ± 19%  numa-meminfo.node1.SReclaimable
> >       3525 ± 13%     +36.6%       4817 ± 14%  slabinfo.blkdev_requests.active_objs
> >       3552 ± 13%     +36.3%       4841 ± 14%  slabinfo.blkdev_requests.num_objs
> >      30779 ±  4%     -34.7%      20084 ± 12%  proc-vmstat.pgmigrate_success
> >      30779 ±  4%     -34.7%      20084 ± 12%  proc-vmstat.numa_pages_migrated
> >     447400 ±  2%     -23.2%     343701 ± 16%  numa-meminfo.node1.Slab
> >  2.532e+10 ±  0%     -33.1%  1.694e+10 ±  1%  cpuidle.C6-IVT.time
> >       3081 ±  9%     +28.0%       3945 ± 12%  slabinfo.mnt_cache.num_objs
> >       3026 ±  9%     +28.8%       3898 ± 12%  slabinfo.mnt_cache.active_objs
> >       5822 ±  4%     +77.8%      10350 ± 25%  numa-meminfo.node1.Writeback
> >       1454 ±  4%     +77.3%       2579 ± 25%  numa-vmstat.node1.nr_writeback
> >     424984 ±  1%     -26.5%     312255 ±  3%  proc-vmstat.numa_pte_updates
> >     368001 ±  1%     -26.8%     269440 ±  3%  proc-vmstat.numa_hint_faults
> >     456465 ±  1%     -26.7%     334594 ±  4%  time.minor_page_faults
> >       3.86 ±  3%     -24.4%       2.92 ±  2%  turbostat.CPU%c3
> >    4661151 ±  2%     +20.6%    5622999 ±  9%  numa-vmstat.node1.nr_free_pages
> >   18644452 ±  2%     +20.6%   22491300 ±  9%  numa-meminfo.node1.MemFree
> >        876 ±  2%     +28.2%       1124 ±  5%  slabinfo.kmalloc-4096.num_objs
> >        858 ±  3%     +24.0%       1064 ±  5%  slabinfo.kmalloc-4096.active_objs
> >   17767832 ±  8%     -25.4%   13249545 ± 17%  cpuidle.POLL.time
> >     285093 ±  1%     -23.1%     219372 ±  5%  proc-vmstat.numa_hint_faults_local
> >     105423 ±  2%     -16.1%      88498 ±  0%  meminfo.Dirty
> >      26365 ±  1%     -16.0%      22152 ±  1%  proc-vmstat.nr_dirty
> >      41.04 ±  1%     -14.1%      35.26 ±  1%  turbostat.CPU%c1
> >       9385 ±  4%     -14.3%       8043 ±  6%  slabinfo.kmalloc-192.active_objs
> >       9574 ±  3%     -13.9%       8241 ±  6%  slabinfo.kmalloc-192.num_objs
> >       2411 ±  3%     +17.0%       2820 ±  4%  slabinfo.kmalloc-2048.active_objs
> >   12595574 ±  0%     -10.0%   11338368 ±  1%  proc-vmstat.pgalloc_normal
> >       5262 ±  1%     +13.3%       5962 ±  1%  slabinfo.kmalloc-1024.num_objs
> >       5262 ±  1%     +12.7%       5932 ±  1%  slabinfo.kmalloc-1024.active_objs
> >       2538 ±  3%     +13.7%       2885 ±  4%  slabinfo.kmalloc-2048.num_objs
> >    5299546 ±  0%      -9.9%    4776351 ±  0%  slabinfo.buffer_head.active_objs
> >    5299546 ±  0%      -9.9%    4776351 ±  0%  slabinfo.buffer_head.num_objs
> >     135885 ±  0%      -9.9%     122470 ±  0%  slabinfo.buffer_head.num_slabs
> >     135885 ±  0%      -9.9%     122470 ±  0%  slabinfo.buffer_head.active_slabs
> >      28.04 ±  2%    +715.6%     228.69 ±  3%  iostat.sdb.avgrq-sz
> >      28.05 ±  2%    +708.1%     226.72 ±  2%  iostat.sdc.avgrq-sz
> >       2245 ±  3%     -81.6%        413 ±  1%  iostat.sda.w/s
> >       5.33 ±  1%   +1008.2%      59.07 ±  1%  iostat.sda.w_await
> >       5.85 ±  1%   +1126.4%      71.69 ±  4%  iostat.sda.r_await
> >       5.36 ±  1%    +978.6%      57.79 ±  3%  iostat.sdc.w_await
> >       1263 ±  4%     -85.8%        179 ±  6%  iostat.sdc.r/s
> >       2257 ±  3%     -81.6%        414 ±  2%  iostat.sdb.w/s
> >       1264 ±  4%     -85.8%        179 ±  6%  iostat.sdb.r/s
> >       5.55 ±  0%   +1024.2%      62.37 ±  4%  iostat.sdb.await
> >       5.89 ±  1%   +1125.9%      72.16 ±  6%  iostat.sdb.r_await
> >       5.36 ±  0%   +1014.3%      59.75 ±  3%  iostat.sdb.w_await
> >       5.57 ±  1%    +987.9%      60.55 ±  3%  iostat.sdc.await
> >       5.51 ±  0%   +1017.3%      61.58 ±  1%  iostat.sda.await
> >       1264 ±  4%     -85.8%        179 ±  6%  iostat.sda.r/s
> >      28.09 ±  2%    +714.2%     228.73 ±  2%  iostat.sda.avgrq-sz
> >       5.95 ±  2%   +1091.0%      70.82 ±  6%  iostat.sdc.r_await
> >       2252 ±  3%     -81.5%        417 ±  2%  iostat.sdc.w/s
> >       4032 ±  2%    +151.6%      10143 ±  1%  iostat.sdb.wrqm/s
> >       4043 ±  2%    +151.0%      10150 ±  1%  iostat.sda.wrqm/s
> >       4035 ±  2%    +151.2%      10138 ±  1%  iostat.sdc.wrqm/s
> >      26252 ±  1%     -54.0%      12077 ±  4%  vmstat.system.in
> >      37813 ±  0%    +101.0%      75998 ±  1%  vmstat.io.bo
> >      37789 ±  0%    +101.0%      75945 ±  1%  iostat.md0.wkB/s
> >        205 ±  0%     +96.1%        402 ±  1%  iostat.md0.w/s
> >     164286 ±  1%     -46.2%      88345 ±  2%  vmstat.system.cs
> >      27.07 ±  2%     -46.7%      14.42 ±  3%  turbostat.%Busy
> >        810 ±  2%     -46.7%        431 ±  3%  turbostat.Avg_MHz
> >      15.56 ±  2%     +71.7%      26.71 ±  1%  iostat.sda.avgqu-sz
> >      15.65 ±  2%     +69.1%      26.46 ±  2%  iostat.sdc.avgqu-sz
> >      15.67 ±  2%     +72.7%      27.06 ±  2%  iostat.sdb.avgqu-sz
> >      25151 ±  0%     +68.3%      42328 ±  1%  iostat.sda.wkB/s
> >      25153 ±  0%     +68.2%      42305 ±  1%  iostat.sdb.wkB/s
> >      25149 ±  0%     +68.2%      42292 ±  1%  iostat.sdc.wkB/s
> >      97.45 ±  0%     -21.1%      76.90 ±  0%  turbostat.CorWatt
> >      12517 ±  0%     -20.2%       9994 ±  1%  iostat.sdc.rkB/s
> >      12517 ±  0%     -20.0%      10007 ±  1%  iostat.sda.rkB/s
> >      12512 ±  0%     -19.9%      10018 ±  1%  iostat.sdb.rkB/s
> >       1863 ±  3%     +24.7%       2325 ±  1%  iostat.sdb.rrqm/s
> >       1865 ±  3%     +24.3%       2319 ±  1%  iostat.sdc.rrqm/s
> >       1864 ±  3%     +24.6%       2322 ±  1%  iostat.sda.rrqm/s
> >        128 ±  0%     -16.4%        107 ±  0%  turbostat.PkgWatt
> >     150569 ±  0%      -8.7%     137525 ±  0%  iostat.md0.avgqu-sz
> >       4.29 ±  0%      -5.1%       4.07 ±  0%  turbostat.RAMWatt
> > 
> > 
> > more detailed changes about ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> > ---------
> > 
> > 26089f4902595a2f  4400755e356f9a2b0b7ceaa02f  
> > ----------------  --------------------------  
> >          %stddev     %change         %stddev
> >              \          |                \  
> >        273 ±  4%     -18.1%        223 ±  6%  fsmark.files_per_sec
> >      29.24 ±  1%     +27.2%      37.20 ±  8%  fsmark.time.elapsed_time.max
> >      29.24 ±  1%     +27.2%      37.20 ±  8%  fsmark.time.elapsed_time
> >        399 ±  4%     -20.0%        319 ±  3%  fsmark.time.percent_of_cpu_this_job_got
> >     129891 ± 20%     -28.9%      92334 ± 15%  fsmark.time.voluntary_context_switches
> >        266 ±  0%    +413.4%       1365 ±  5%  slabinfo.raid5-md0.num_objs
> >        266 ±  0%    +413.4%       1365 ±  5%  slabinfo.raid5-md0.active_objs
> >       0.23 ± 27%     +98.6%       0.46 ± 35%  turbostat.CPU%c3
> >   56612063 ±  9%     +36.7%   77369763 ± 20%  cpuidle.C1-IVT.time
> >    5579498 ± 14%     -36.0%    3571516 ±  6%  cpuidle.C1E-IVT.time
> >       4668 ± 38%     +64.7%       7690 ± 19%  numa-vmstat.node0.nr_unevictable
> >      18674 ± 38%     +64.7%      30762 ± 19%  numa-meminfo.node0.Unevictable
> >       9298 ± 37%     +64.4%      15286 ± 19%  proc-vmstat.nr_unevictable
> >       4629 ± 37%     +64.1%       7596 ± 19%  numa-vmstat.node1.nr_unevictable
> >      18535 ± 37%     +63.9%      30385 ± 19%  numa-meminfo.node1.Unevictable
> >    4270894 ± 19%     +65.6%    7070923 ± 21%  cpuidle.C3-IVT.time
> >      38457 ± 37%     +59.0%      61148 ± 19%  meminfo.Unevictable
> >    3748226 ± 17%     +26.6%    4743674 ± 16%  numa-vmstat.node0.numa_local
> >    4495283 ± 13%     -24.8%    3382315 ± 17%  numa-vmstat.node0.nr_free_pages
> >    3818432 ± 16%     +26.5%    4830938 ± 16%  numa-vmstat.node0.numa_hit
> >   17966826 ± 13%     -24.7%   13537228 ± 17%  numa-meminfo.node0.MemFree
> >   14901309 ± 15%     +29.7%   19330906 ± 12%  numa-meminfo.node0.MemUsed
> >         26 ± 21%     -32.9%         17 ± 14%  cpuidle.POLL.usage
> >  1.183e+09 ±  1%     +29.6%  1.533e+09 ±  8%  cpuidle.C6-IVT.time
> >      29.24 ±  1%     +27.2%      37.20 ±  8%  time.elapsed_time
> >      29.24 ±  1%     +27.2%      37.20 ±  8%  time.elapsed_time.max
> >        399 ±  4%     -20.0%        319 ±  3%  time.percent_of_cpu_this_job_got
> >        850 ±  4%      -8.6%        777 ±  5%  slabinfo.blkdev_requests.num_objs
> >        850 ±  4%      -8.6%        777 ±  5%  slabinfo.blkdev_requests.active_objs
> >      14986 ±  9%     +17.1%      17548 ±  8%  numa-vmstat.node0.nr_slab_reclaimable
> >      11943 ±  5%     -12.6%      10441 ±  2%  slabinfo.kmalloc-192.num_objs
> >      59986 ±  9%     +17.0%      70186 ±  8%  numa-meminfo.node0.SReclaimable
> >       3703 ±  6%     +10.2%       4082 ±  7%  slabinfo.btrfs_delayed_data_ref.num_objs
> >     133551 ±  6%     +16.1%     154995 ±  1%  proc-vmstat.pgfault
> >     129891 ± 20%     -28.9%      92334 ± 15%  time.voluntary_context_switches
> >      11823 ±  4%     -12.0%      10409 ±  3%  slabinfo.kmalloc-192.active_objs
> >       3703 ±  6%      +9.7%       4061 ±  7%  slabinfo.btrfs_delayed_data_ref.active_objs
> >      19761 ±  2%     -11.2%      17542 ±  6%  slabinfo.anon_vma.active_objs
> >      19761 ±  2%     -11.2%      17544 ±  6%  slabinfo.anon_vma.num_objs
> >      13002 ±  3%     +14.9%      14944 ±  5%  slabinfo.kmalloc-256.num_objs
> >      12695 ±  3%     +13.8%      14446 ±  7%  slabinfo.kmalloc-256.active_objs
> >       1190 ±  1%     -11.8%       1050 ±  3%  slabinfo.mnt_cache.num_objs
> >       1190 ±  1%     -11.8%       1050 ±  3%  slabinfo.mnt_cache.active_objs
> >     136862 ±  1%     -13.8%     117938 ±  7%  cpuidle.C6-IVT.usage
> >    1692630 ±  3%     +12.3%    1900854 ±  0%  numa-vmstat.node0.nr_written
> >       1056 ±  2%      +8.8%       1149 ±  3%  slabinfo.mm_struct.active_objs
> >       1056 ±  2%      +8.8%       1149 ±  3%  slabinfo.mm_struct.num_objs
> >      24029 ± 11%     -30.6%      16673 ±  8%  vmstat.system.cs
> >       8859 ±  2%     -15.0%       7530 ±  8%  vmstat.system.in
> >     905630 ±  2%     -16.8%     753097 ±  4%  iostat.md0.wkB/s
> >     906433 ±  2%     -16.9%     753482 ±  4%  vmstat.io.bo
> >       3591 ±  2%     -16.9%       2982 ±  4%  iostat.md0.w/s
> >      13.22 ±  5%     -16.3%      11.07 ±  1%  turbostat.%Busy
> >        402 ±  4%     -15.9%        338 ±  1%  turbostat.Avg_MHz
> >      54236 ±  3%     +10.4%      59889 ±  4%  iostat.md0.avgqu-sz
> >       7.67 ±  1%      +4.5%       8.01 ±  1%  turbostat.RAMWatt
> > 
> > 
> > 
> > 	--yliu
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 



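The comparison tables above follow a fixed lkp layout (base value ± stddev, %change, new value ± stddev, metric name). As a minimal sketch for anyone post-processing such reports — this helper is hypothetical, not part of the lkp tooling — the metric name and percentage change can be pulled out with a regular expression:

```python
import re

# Match lkp comparison lines such as:
#   "273 ±  4%     -18.1%        223 ±  6%  fsmark.files_per_sec"
# Groups: base value, base stddev, %change, new value, new stddev, metric.
LINE_RE = re.compile(
    r"([\d.]+(?:e[+-]\d+)?)\s+±\s*(\d+)%\s+"   # base value ± stddev
    r"([+-][\d.]+)%\s+"                         # percent change
    r"([\d.]+(?:e[+-]\d+)?)\s+±\s*(\d+)%\s+"   # new value ± stddev
    r"(\S+)"                                    # metric name
)

def parse_change(line):
    """Return (metric, pct_change) for a comparison line, or None."""
    m = LINE_RE.search(line)
    if not m:
        return None
    return m.group(6), float(m.group(3))

print(parse_change("273 ±  4%     -18.1%        223 ±  6%  fsmark.files_per_sec"))
```

The same pattern also handles the scientific-notation values in the report (e.g. the `cpuidle.C6-IVT.time` line).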

end of thread, other threads:[~2015-03-26  4:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-18  5:00 performance changes on 4400755e: 200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and few more Yuanhan Liu
2015-03-25  3:03 ` NeilBrown
2015-03-26  4:30   ` Yuanhan Liu
