From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: lkp@lists.01.org
Subject: performance changes on c9dc4c65: 9.8% fsmark.files_per_sec
Date: Thu, 23 Apr 2015 09:38:01 +0800
Message-ID: <20150423013801.GP8084@yliu-dev.sh.intel.com>

FYI, we found a performance improvement on `fsmark.files_per_sec', as the
commit message anticipates, from commit c9dc4c6578502c2085705347375b82089aad18d0:

    > commit c9dc4c6578502c2085705347375b82089aad18d0
    > Author:     Chris Mason <clm@fb.com>
    > AuthorDate: Sat Apr 4 17:14:42 2015 -0700
    > Commit:     Chris Mason <clm@fb.com>
    > CommitDate: Fri Apr 10 14:07:11 2015 -0700
    > 
    >     Btrfs: two stage dirty block group writeout

4c6d1d85ad89fd8e32dc9204b7f944854399bda9     c9dc4c6578502c2085705347375b82089aad18d0
----------------------------------------     ----------------------------------------
run time(m)     metric_value     ±stddev     run time(m)     metric_value     ±stddev     change   testbox/benchmark/testcase-params
--- ------  ----------------------------     --- ------  ----------------------------     -------- ------------------------------
3   7.3              |35.267|    ±0.5        5   6.6              |38.740|    ±1.6            9.8% ivb44/fsmark/1x-1t-1HDD-btrfs-4M-60G-NoSync
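The 9.8% in the change column is consistent with the relative difference
between the two metric_value means. A minimal sketch, assuming the change is
computed as (new - base) / base; the function name is illustrative and is not
part of the LKP tooling:

```python
def percent_change(base: float, new: float) -> float:
    """Relative change of `new` over `base`, in percent (assumed formula)."""
    return (new - base) / base * 100.0


# Mean fsmark.files_per_sec for the parent commit and for c9dc4c65,
# copied from the table above.
base_files_per_sec = 35.267
new_files_per_sec = 38.740

change = percent_change(base_files_per_sec, new_files_per_sec)
print(f"{change:+.1f}%")
```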


NOTE: here is some more explanation of the test parameters, to help you
      understand what the testcase does:

      1x: 'x' means iterations (loops), corresponding to the '-L' option of fsmark

      1t, 64t: 't' means threads

      4M: the size of each file, corresponding to the '-s' option of fsmark
      60G: the total test size
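
To make the parameter string concrete, here is a rough sketch that splits
`1x-1t-1HDD-btrfs-4M-60G-NoSync' into named fields. The field names, the
helper functions and the use of binary units (1M = 1024 * 1024 bytes) are
assumptions made for illustration; they are not taken from the LKP scripts.

```python
def parse_size(text: str) -> int:
    """Convert a size such as '4M' or '60G' to bytes (binary units assumed)."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    return int(text[:-1]) * units[text[-1]]


def parse_testcase(params: str) -> dict:
    """Split an fsmark testcase-params string into hypothetical named fields."""
    loops, threads, disk, fs, file_size, total_size, sync = params.split("-")
    return {
        "iterations": int(loops.rstrip("x")),    # '1x'  -> 1 loop
        "threads": int(threads.rstrip("t")),     # '1t'  -> 1 thread
        "disk": disk,                            # '1HDD'
        "filesystem": fs,                        # 'btrfs'
        "file_size": parse_size(file_size),      # '4M'  -> bytes per file
        "total_size": parse_size(total_size),    # '60G' -> bytes in total
        "sync_mode": sync,                       # 'NoSync'
    }


case = parse_testcase("1x-1t-1HDD-btrfs-4M-60G-NoSync")
# Under the binary-unit assumption, the run writes this many files:
file_count = case["total_size"] // case["file_size"]
print(file_count)
```

Under these assumptions the run writes 15360 files of 4M each.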


And FYI, here are more changes by the same commit:

4c6d1d85ad89fd8e  c9dc4c6578502c208570534737  
----------------  --------------------------  
         %stddev     %change         %stddev
             \          |                \  
      9864 ±  2%    +156.9%      25345 ±  4%  fsmark.time.voluntary_context_switches
         9 ±  0%     +17.8%         10 ±  4%  fsmark.time.percent_of_cpu_this_job_got
    462211 ±  1%     +16.8%     539707 ±  0%  fsmark.app_overhead
     35.27 ±  0%      +9.8%      38.74 ±  1%  fsmark.files_per_sec
       435 ±  0%      -9.0%        396 ±  1%  fsmark.time.elapsed_time.max
       435 ±  0%      -9.0%        396 ±  1%  fsmark.time.elapsed_time
      5.20 ±  2%     -70.3%       1.54 ±  6%  turbostat.Pkg%pc6
   2447873 ± 42%     -67.9%     785086 ± 33%  numa-numastat.node1.numa_hit
   2413662 ± 43%     -68.1%     771115 ± 31%  numa-numastat.node1.local_node
      9864 ±  2%    +156.9%      25345 ±  4%  time.voluntary_context_switches
    187680 ± 10%    +126.8%     425676 ±  7%  numa-vmstat.node1.nr_dirty
    747361 ±  9%    +127.8%    1702809 ±  7%  numa-meminfo.node1.Dirty
   1787510 ±  1%    +117.0%    3878984 ±  2%  meminfo.Dirty
    446861 ±  1%    +117.0%     969472 ±  2%  proc-vmstat.nr_dirty
   1655962 ± 37%     -59.3%     673988 ± 29%  numa-vmstat.node1.numa_local
   1036191 ±  8%    +110.3%    2179311 ±  3%  numa-meminfo.node0.Dirty
    259069 ±  8%    +110.3%     544783 ±  3%  numa-vmstat.node0.nr_dirty
   1687987 ± 37%     -58.6%     698626 ± 29%  numa-vmstat.node1.numa_hit
         1 ±  0%    +100.0%          2 ±  0%  vmstat.procs.b
      0.02 ±  0%    +100.0%       0.04 ± 22%  turbostat.CPU%c3
      6.03 ±  1%     +76.9%      10.67 ±  1%  turbostat.CPU%c1
 5.189e+08 ±  0%     +72.6%  8.956e+08 ±  1%  cpuidle.C1-IVT.time
   2646692 ±  7%     +75.0%    4630890 ± 23%  cpuidle.C3-IVT.time
      5301 ±  6%     -31.7%       3620 ±  3%  slabinfo.btrfs_ordered_extent.active_objs
     10549 ± 16%     -30.3%       7349 ± 12%  numa-vmstat.node1.nr_slab_reclaimable
      5353 ±  6%     -31.4%       3670 ±  3%  slabinfo.btrfs_ordered_extent.num_objs
     42169 ± 16%     -30.3%      29397 ± 12%  numa-meminfo.node1.SReclaimable
   1619825 ± 22%     +39.4%    2258188 ±  4%  proc-vmstat.pgfree
      4611 ±  7%     -28.0%       3318 ±  1%  slabinfo.btrfs_delayed_ref_head.num_objs
      4471 ±  8%     -27.0%       3264 ±  2%  slabinfo.btrfs_delayed_ref_head.active_objs
     67.93 ±  1%     -24.7%      51.15 ±  4%  turbostat.Pkg%pc2
   2332975 ± 21%     +45.6%    3396446 ±  4%  numa-vmstat.node1.numa_other
   2300949 ± 22%     +46.5%    3371807 ±  4%  numa-vmstat.node1.numa_miss
   2300941 ± 22%     +46.5%    3371793 ±  4%  numa-vmstat.node0.numa_foreign
      2952 ±  8%     -23.3%       2263 ±  3%  slabinfo.btrfs_delayed_data_ref.num_objs
   2570716 ±  3%     +25.7%    3230157 ±  2%  numa-meminfo.node1.Writeback
    642367 ±  3%     +25.7%     807533 ±  2%  numa-vmstat.node1.nr_writeback
     95408 ± 13%     -17.3%      78910 ±  6%  numa-meminfo.node1.Slab
      2803 ±  7%     -21.1%       2210 ±  3%  slabinfo.btrfs_delayed_data_ref.active_objs
       240 ±  9%     +23.1%        295 ± 16%  numa-vmstat.node0.nr_page_table_pages
   4626942 ± 19%     +49.6%    6924087 ± 22%  cpuidle.C1E-IVT.time
   5585235 ±  0%     +25.5%    7011242 ±  0%  meminfo.Writeback
   1396232 ±  0%     +25.5%    1752892 ±  0%  proc-vmstat.nr_writeback
       962 ±  9%     +23.0%       1184 ± 16%  numa-meminfo.node0.PageTables
         9 ±  0%     +17.8%         10 ±  4%  time.percent_of_cpu_this_job_got
    754027 ±  2%     +25.2%     944312 ±  1%  numa-vmstat.node0.nr_writeback
   3018674 ±  2%     +25.1%    3777338 ±  1%  numa-meminfo.node0.Writeback
     23509 ±  1%     -16.9%      19530 ±  0%  slabinfo.kmalloc-1024.active_objs
      2972 ±  1%     +21.4%       3607 ±  0%  proc-vmstat.nr_alloc_batch
     13956 ±  4%     -15.6%      11773 ±  8%  slabinfo.kmalloc-192.active_objs
       743 ±  1%     -16.0%        624 ±  0%  slabinfo.kmalloc-1024.active_slabs
       743 ±  1%     -16.0%        624 ±  0%  slabinfo.kmalloc-1024.num_slabs
     23790 ±  1%     -16.0%      19983 ±  0%  slabinfo.kmalloc-1024.num_objs
     68983 ±  2%     +19.1%      82190 ±  4%  softirqs.RCU
       222 ± 11%     +47.0%        326 ± 25%  cpuidle.POLL.usage
     14177 ±  0%     +17.8%      16702 ±  1%  slabinfo.kmalloc-2048.num_objs
     14045 ±  0%     +18.0%      16568 ±  1%  slabinfo.kmalloc-2048.active_objs
       885 ±  0%     +17.8%       1043 ±  1%  slabinfo.kmalloc-2048.num_slabs
       885 ±  0%     +17.8%       1043 ±  1%  slabinfo.kmalloc-2048.active_slabs
     14025 ±  4%     -13.3%      12157 ±  7%  slabinfo.kmalloc-192.num_objs
   8287205 ± 10%     +16.0%    9611684 ±  0%  numa-numastat.node0.numa_hit
   8276795 ± 10%     +15.9%    9592682 ±  0%  numa-numastat.node0.local_node
   2615463 ±  5%      -9.6%    2365256 ±  2%  numa-vmstat.node1.nr_written
      1814 ±  5%     -12.7%       1584 ± 11%  numa-meminfo.node1.PageTables
       453 ±  5%     -12.6%        396 ± 11%  numa-vmstat.node1.nr_page_table_pages
    105943 ±  6%     +13.6%     120352 ±  2%  numa-meminfo.node0.SReclaimable
     26492 ±  6%     +13.6%      30086 ±  2%  numa-vmstat.node0.nr_slab_reclaimable
      0.41 ±  1%     +17.1%       0.48 ±  4%  time.user_time
      2155 ±  4%     -11.1%       1916 ±  5%  slabinfo.btrfs_delayed_tree_ref.active_objs
 2.028e+10 ±  0%     -11.1%  1.803e+10 ±  1%  cpuidle.C6-IVT.time
      2155 ±  4%     -10.8%       1922 ±  5%  slabinfo.btrfs_delayed_tree_ref.num_objs
      1202 ±  4%     -11.2%       1067 ±  9%  slabinfo.btrfs_trans_handle.num_objs
      1202 ±  4%     -11.2%       1067 ±  9%  slabinfo.btrfs_trans_handle.active_objs
    192641 ±  5%      +9.8%     211569 ±  2%  numa-meminfo.node0.Slab
    268137 ±  0%     +12.2%     300911 ±  2%  cpuidle.C6-IVT.usage
       435 ±  0%      -9.0%        396 ±  1%  time.elapsed_time
       435 ±  0%      -9.0%        396 ±  1%  time.elapsed_time.max
     21057 ±  0%      -9.0%      19165 ±  1%  uptime.idle
     29.89 ±  0%     +37.2%      41.01 ±  3%  turbostat.CorWatt
     59.95 ±  0%     +19.6%      71.69 ±  2%  turbostat.PkgWatt
     18873 ±  0%     +14.9%      21692 ±  1%  vmstat.system.cs
        21 ±  2%      +8.8%         23 ±  3%  turbostat.Avg_MHz
       135 ±  0%      +9.1%        147 ±  0%  iostat.sda.avgqu-sz
      0.69 ±  2%      +7.2%       0.74 ±  3%  turbostat.%Busy
      7478 ±  0%      +5.1%       7861 ±  0%  iostat.sda.await
      7478 ±  0%      +5.1%       7861 ±  0%  iostat.sda.w_await
       239 ±  0%      +3.6%        247 ±  0%  iostat.sda.wrqm/s
      3.54 ±  0%      +3.9%       3.68 ±  0%  turbostat.RAMWatt
    129619 ±  0%      +3.2%     133743 ±  0%  vmstat.io.bo
    128667 ±  0%      +1.9%     131056 ±  0%  iostat.sda.wkB/s
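
For anyone who wants to post-process the comparison table above, the rows
follow a regular layout: base mean, base %stddev, %change, new mean, new
%stddev, metric name. The following is a minimal sketch of a parser for one
row; the regular expression and the dictionary keys are assumptions based on
the rows shown here, not a format specification.

```python
import re

# base ± sd%   change%   new ± sd%   metric-name
ROW = re.compile(
    r"^\s*(?P<base>\S+) ±\s*(?P<base_sd>\d+)%"
    r"\s+(?P<change>[+-][\d.]+)%"
    r"\s+(?P<new>\S+) ±\s*(?P<new_sd>\d+)%"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line: str):
    """Return the fields of one comparison row, or None if it is not a row."""
    match = ROW.match(line)
    if match is None:
        return None
    return {
        "metric": match["metric"],
        "base": float(match["base"]),        # also accepts forms like 5.189e+08
        "base_stddev_pct": int(match["base_sd"]),
        "change_pct": float(match["change"]),
        "new": float(match["new"]),
        "new_stddev_pct": int(match["new_sd"]),
    }


row = parse_row(
    "      9864 ±  2%    +156.9%      25345 ±  4%  "
    "fsmark.time.voluntary_context_switches"
)
print(row["metric"], row["change_pct"])
```

Header and separator lines do not match the pattern, so `parse_row' returns
None for them and they can simply be skipped.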


	--yliu
