* performance changes on 4400755e: 200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and few more
@ 2015-03-18 5:00 ` Yuanahn Liu
0 siblings, 0 replies; 6+ messages in thread
From: Yuanahn Liu @ 2015-03-18 5:00 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 19093 bytes --]
Hi,
FYI, we noticed performance changes on `fsmark.files_per_sec' by 4400755e356f9a2b0b7ceaa02f57b1c7546c3765:
> commit 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> Author: NeilBrown <neilb@suse.de>
> AuthorDate: Thu Feb 26 12:47:56 2015 +1100
> Commit: NeilBrown <neilb@suse.de>
> CommitDate: Wed Mar 4 13:40:19 2015 +1100
>
> md/raid5: allow the stripe_cache to grow and shrink.
26089f4902595a2f64c512066af07af6e82eb096  4400755e356f9a2b0b7ceaa02f57b1c7546c3765
----------------------------------------  ----------------------------------------
run  time(m)    metric_value     ±stddev  run  time(m)    metric_value     ±stddev    change  testbox/benchmark/sub-testcase
---  -------  --------------------------  ---  -------  --------------------------  --------  ------------------------------
3       18.6           6.400       ±0.0%  5        9.2          19.200       ±0.0%    200.0%  ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
3       24.7           6.400       ±0.0%  3       13.7          12.800       ±0.0%    100.0%  ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
3       17.5          28.267       ±9.6%  3       12.3          42.833       ±6.5%     51.5%  ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-120G-NoSync
3       16.7          30.700       ±1.5%  3       12.6          40.733       ±2.4%     32.7%  ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
3       29.0           5.867       ±0.8%  5       23.6           7.240       ±0.7%     23.4%  ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
3       28.5           6.000       ±0.0%  3       23.2           7.367       ±0.6%     22.8%  ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
5       11.7          14.600       ±0.0%  5        9.7          17.500       ±0.4%     19.9%  ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
3       22.4          25.600       ±0.0%  5       17.9          30.120       ±4.1%     17.7%  ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-120G-NoSync
5       10.8          47.320       ±0.6%  5        9.3          54.820       ±0.2%     15.8%  ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
1        0.5         252.400       ±0.0%  1        0.5         263.300       ±0.0%      4.3%  ivb44/fsmark/1x-1t-4BRD_12G-RAID5-ext4-4M-30G-NoSync

3        0.5         273.100       ±4.3%  3        0.6         223.567       ±6.5%    -18.1%  ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
3        8.1          63.133       ±0.5%  3        9.2          55.633       ±0.2%    -11.9%  ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync
3        8.2          64.000       ±0.0%  3        9.2          57.600       ±0.0%    -10.0%  ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync
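For readers checking the "change" column: the figures in the summary table are consistent with a plain relative change of the mean metric_value against the base commit. The sketch below recomputes a few rows under that assumption; the function name percent_change is hypothetical and is not taken from the lkp tooling.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# (base metric_value, patched metric_value) pairs copied from the summary table.
rows = {
    "1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose": (6.400, 19.200),
    "1x-64t-3HDD-RAID5-f2fs-4M-120G-NoSync": (28.267, 42.833),
    "1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync": (273.100, 223.567),
    "1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync": (64.000, 57.600),
}

for name, (base, new) in rows.items():
    # Rounded to one decimal place, matching the precision used in the report.
    print(f"{percent_change(base, new):+7.1f}%  {name}")
```

A positive value means more files per second with the patched kernel; a negative value is a regression.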
NOTE: here is some more info about the test parameters, to help you
understand the testcases better:
1x: 'x' means iterations (loops), corresponding to the '-L' option of fsmark
1t, 64t: 't' means threads
4M: the size of a single file, corresponding to the '-s' option of fsmark
40G, 30G, 120G: the total test size
4BRD_12G: BRD is the ramdisk; '4' means 4 ramdisks and '12G' is the size
of one ramdisk, so 48G in total. We built the RAID array on those
ramdisks.
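The naming scheme described in the note can be unpacked mechanically. The sketch below is only an illustration of that scheme: the FsmarkCase type and both helper functions are hypothetical names, and it assumes every sub-testcase name has exactly the eight dash-separated fields seen in this report.

```python
from typing import NamedTuple, Optional


class FsmarkCase(NamedTuple):
    iterations: int   # "1x"  -> 1 loop
    threads: int      # "64t" -> 64 threads
    disks: str        # "3HDD" or "4BRD_12G"
    raid_level: str   # "RAID5"
    filesystem: str   # "ext4", "btrfs", "xfs", "f2fs"
    file_size: str    # "4M", size of a single file
    total_size: str   # "40G", total test size
    sync_mode: str    # "NoSync" or "fsyncBeforeClose"


def parse_case(name: str) -> FsmarkCase:
    """Split a sub-testcase name such as '1x-64t-3HDD-RAID5-ext4-4M-40G-NoSync'."""
    loops, threads, disks, raid, fs, fsize, total, sync = name.split("-")
    return FsmarkCase(int(loops.rstrip("x")), int(threads.rstrip("t")),
                      disks, raid, fs, fsize, total, sync)


def ramdisk_total_gb(disks: str) -> Optional[int]:
    """For a ramdisk spec like '4BRD_12G', return count * per-disk size in GB."""
    if "BRD_" not in disks:
        return None  # not a ramdisk setup (for example '3HDD')
    count, size = disks.split("BRD_")
    return int(count) * int(size.rstrip("G"))


case = parse_case("1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync")
print(case)
print(ramdisk_total_gb(case.disks))
```

For the 4BRD_12G case the second helper reproduces the 48G total mentioned in the note.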
As you can see from the data above, interestingly, all of the performance
regressions come from btrfs testing. That's why Chris is also on the
Cc list, just FYI.
Below are more detailed changes for the largest positive and negative changes.
more detailed changes about ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
---------
26089f4902595a2f 4400755e356f9a2b0b7ceaa02f
---------------- --------------------------
%stddev %change %stddev
\ | \
6.40 ± 0% +200.0% 19.20 ± 0% fsmark.files_per_sec
1.015e+08 ± 1% -73.6% 26767355 ± 3% fsmark.time.voluntary_context_switches
13793 ± 1% -73.9% 3603 ± 5% fsmark.time.system_time
78473 ± 6% -64.3% 28016 ± 7% fsmark.time.involuntary_context_switches
15789555 ± 9% -54.7% 7159485 ± 13% fsmark.app_overhead
1115 ± 0% -50.3% 554 ± 1% fsmark.time.elapsed_time.max
1115 ± 0% -50.3% 554 ± 1% fsmark.time.elapsed_time
1235 ± 2% -47.5% 649 ± 3% fsmark.time.percent_of_cpu_this_job_got
456465 ± 1% -26.7% 334594 ± 4% fsmark.time.minor_page_faults
275 ± 0% +1257.7% 3733 ± 2% slabinfo.raid5-md0.num_objs
275 ± 0% +1257.7% 3733 ± 2% slabinfo.raid5-md0.active_objs
11 ± 0% +1250.9% 148 ± 2% slabinfo.raid5-md0.active_slabs
11 ± 0% +1250.9% 148 ± 2% slabinfo.raid5-md0.num_slabs
2407 ± 4% +293.4% 9471 ± 26% numa-meminfo.node0.Writeback
600 ± 4% +294.9% 2372 ± 26% numa-vmstat.node0.nr_writeback
1114505 ± 0% -77.4% 251696 ± 2% softirqs.TASKLET
1808027 ± 1% -77.7% 402378 ± 4% softirqs.RCU
12158665 ± 1% -77.1% 2786069 ± 4% cpuidle.C3-IVT.usage
1119433 ± 0% -77.3% 254192 ± 2% softirqs.BLOCK
37824202 ± 1% -75.1% 9405078 ± 4% cpuidle.C6-IVT.usage
1.015e+08 ± 1% -73.6% 26767355 ± 3% time.voluntary_context_switches
13793 ± 1% -73.9% 3603 ± 5% time.system_time
5971084 ± 1% -73.6% 1574912 ± 5% softirqs.SCHED
10539492 ± 3% -72.0% 2956258 ± 6% cpuidle.C1E-IVT.usage
2 ± 0% +230.0% 6 ± 12% vmstat.procs.b
14064 ± 1% -71.2% 4049 ± 6% softirqs.HRTIMER
7388306 ± 1% -71.2% 2129929 ± 4% softirqs.TIMER
3.496e+09 ± 1% -70.3% 1.04e+09 ± 1% cpuidle.C3-IVT.time
0.88 ± 6% +224.9% 2.87 ± 11% turbostat.Pkg%pc6
19969464 ± 2% -66.2% 6750675 ± 5% cpuidle.C1-IVT.usage
78473 ± 6% -64.3% 28016 ± 7% time.involuntary_context_switches
4.23 ± 5% +181.4% 11.90 ± 3% turbostat.Pkg%pc2
2.551e+09 ± 1% -61.4% 9.837e+08 ± 3% cpuidle.C1E-IVT.time
8084 ± 3% +142.6% 19608 ± 3% meminfo.Writeback
2026 ± 4% +141.6% 4895 ± 4% proc-vmstat.nr_writeback
165 ± 4% -56.9% 71 ± 14% numa-vmstat.node1.nr_inactive_anon
7.748e+09 ± 3% -50.3% 3.852e+09 ± 3% cpuidle.C1-IVT.time
175 ± 5% -53.2% 82 ± 13% numa-vmstat.node1.nr_shmem
1115 ± 0% -50.3% 554 ± 1% time.elapsed_time.max
1115 ± 0% -50.3% 554 ± 1% time.elapsed_time
1147 ± 0% -49.0% 585 ± 1% uptime.boot
2260889 ± 0% -48.8% 1157272 ± 1% proc-vmstat.pgfree
16805 ± 2% -35.9% 10776 ± 23% numa-vmstat.node1.nr_dirty
1235 ± 2% -47.5% 649 ± 3% time.percent_of_cpu_this_job_got
67245 ± 2% -35.9% 43122 ± 23% numa-meminfo.node1.Dirty
39041 ± 0% -45.7% 21212 ± 2% uptime.idle
13 ± 9% -49.0% 6 ± 11% vmstat.procs.r
3072 ± 10% -40.3% 1833 ± 9% cpuidle.POLL.usage
3045115 ± 0% -46.1% 1642053 ± 1% proc-vmstat.pgfault
202 ± 1% -45.2% 110 ± 0% proc-vmstat.nr_inactive_anon
4583079 ± 2% -31.4% 3143602 ± 16% numa-vmstat.node1.numa_hit
28.03 ± 0% +69.1% 47.39 ± 1% turbostat.CPU%c6
223 ± 1% -41.1% 131 ± 1% proc-vmstat.nr_shmem
4518820 ± 3% -30.8% 3128304 ± 16% numa-vmstat.node1.numa_local
3363496 ± 3% -27.4% 2441619 ± 20% numa-vmstat.node1.nr_dirtied
3345346 ± 3% -27.4% 2428396 ± 20% numa-vmstat.node1.nr_written
0.18 ± 18% +105.6% 0.37 ± 36% turbostat.Pkg%pc3
3427913 ± 3% -27.3% 2492563 ± 20% numa-vmstat.node1.nr_inactive_file
13712431 ± 3% -27.3% 9971152 ± 20% numa-meminfo.node1.Inactive
13711768 ± 3% -27.3% 9970866 ± 20% numa-meminfo.node1.Inactive(file)
3444598 ± 3% -27.2% 2508920 ± 20% numa-vmstat.node1.nr_file_pages
13778510 ± 3% -27.2% 10036287 ± 20% numa-meminfo.node1.FilePages
8819175 ± 1% -28.3% 6320188 ± 19% numa-numastat.node1.numa_hit
8819051 ± 1% -28.3% 6320152 ± 19% numa-numastat.node1.local_node
14350918 ± 3% -26.8% 10504070 ± 19% numa-meminfo.node1.MemUsed
100892 ± 3% -26.0% 74623 ± 19% numa-vmstat.node1.nr_slab_reclaimable
403571 ± 3% -26.0% 298513 ± 19% numa-meminfo.node1.SReclaimable
3525 ± 13% +36.6% 4817 ± 14% slabinfo.blkdev_requests.active_objs
3552 ± 13% +36.3% 4841 ± 14% slabinfo.blkdev_requests.num_objs
30779 ± 4% -34.7% 20084 ± 12% proc-vmstat.pgmigrate_success
30779 ± 4% -34.7% 20084 ± 12% proc-vmstat.numa_pages_migrated
447400 ± 2% -23.2% 343701 ± 16% numa-meminfo.node1.Slab
2.532e+10 ± 0% -33.1% 1.694e+10 ± 1% cpuidle.C6-IVT.time
3081 ± 9% +28.0% 3945 ± 12% slabinfo.mnt_cache.num_objs
3026 ± 9% +28.8% 3898 ± 12% slabinfo.mnt_cache.active_objs
5822 ± 4% +77.8% 10350 ± 25% numa-meminfo.node1.Writeback
1454 ± 4% +77.3% 2579 ± 25% numa-vmstat.node1.nr_writeback
424984 ± 1% -26.5% 312255 ± 3% proc-vmstat.numa_pte_updates
368001 ± 1% -26.8% 269440 ± 3% proc-vmstat.numa_hint_faults
456465 ± 1% -26.7% 334594 ± 4% time.minor_page_faults
3.86 ± 3% -24.4% 2.92 ± 2% turbostat.CPU%c3
4661151 ± 2% +20.6% 5622999 ± 9% numa-vmstat.node1.nr_free_pages
18644452 ± 2% +20.6% 22491300 ± 9% numa-meminfo.node1.MemFree
876 ± 2% +28.2% 1124 ± 5% slabinfo.kmalloc-4096.num_objs
858 ± 3% +24.0% 1064 ± 5% slabinfo.kmalloc-4096.active_objs
17767832 ± 8% -25.4% 13249545 ± 17% cpuidle.POLL.time
285093 ± 1% -23.1% 219372 ± 5% proc-vmstat.numa_hint_faults_local
105423 ± 2% -16.1% 88498 ± 0% meminfo.Dirty
26365 ± 1% -16.0% 22152 ± 1% proc-vmstat.nr_dirty
41.04 ± 1% -14.1% 35.26 ± 1% turbostat.CPU%c1
9385 ± 4% -14.3% 8043 ± 6% slabinfo.kmalloc-192.active_objs
9574 ± 3% -13.9% 8241 ± 6% slabinfo.kmalloc-192.num_objs
2411 ± 3% +17.0% 2820 ± 4% slabinfo.kmalloc-2048.active_objs
12595574 ± 0% -10.0% 11338368 ± 1% proc-vmstat.pgalloc_normal
5262 ± 1% +13.3% 5962 ± 1% slabinfo.kmalloc-1024.num_objs
5262 ± 1% +12.7% 5932 ± 1% slabinfo.kmalloc-1024.active_objs
2538 ± 3% +13.7% 2885 ± 4% slabinfo.kmalloc-2048.num_objs
5299546 ± 0% -9.9% 4776351 ± 0% slabinfo.buffer_head.active_objs
5299546 ± 0% -9.9% 4776351 ± 0% slabinfo.buffer_head.num_objs
135885 ± 0% -9.9% 122470 ± 0% slabinfo.buffer_head.num_slabs
135885 ± 0% -9.9% 122470 ± 0% slabinfo.buffer_head.active_slabs
28.04 ± 2% +715.6% 228.69 ± 3% iostat.sdb.avgrq-sz
28.05 ± 2% +708.1% 226.72 ± 2% iostat.sdc.avgrq-sz
2245 ± 3% -81.6% 413 ± 1% iostat.sda.w/s
5.33 ± 1% +1008.2% 59.07 ± 1% iostat.sda.w_await
5.85 ± 1% +1126.4% 71.69 ± 4% iostat.sda.r_await
5.36 ± 1% +978.6% 57.79 ± 3% iostat.sdc.w_await
1263 ± 4% -85.8% 179 ± 6% iostat.sdc.r/s
2257 ± 3% -81.6% 414 ± 2% iostat.sdb.w/s
1264 ± 4% -85.8% 179 ± 6% iostat.sdb.r/s
5.55 ± 0% +1024.2% 62.37 ± 4% iostat.sdb.await
5.89 ± 1% +1125.9% 72.16 ± 6% iostat.sdb.r_await
5.36 ± 0% +1014.3% 59.75 ± 3% iostat.sdb.w_await
5.57 ± 1% +987.9% 60.55 ± 3% iostat.sdc.await
5.51 ± 0% +1017.3% 61.58 ± 1% iostat.sda.await
1264 ± 4% -85.8% 179 ± 6% iostat.sda.r/s
28.09 ± 2% +714.2% 228.73 ± 2% iostat.sda.avgrq-sz
5.95 ± 2% +1091.0% 70.82 ± 6% iostat.sdc.r_await
2252 ± 3% -81.5% 417 ± 2% iostat.sdc.w/s
4032 ± 2% +151.6% 10143 ± 1% iostat.sdb.wrqm/s
4043 ± 2% +151.0% 10150 ± 1% iostat.sda.wrqm/s
4035 ± 2% +151.2% 10138 ± 1% iostat.sdc.wrqm/s
26252 ± 1% -54.0% 12077 ± 4% vmstat.system.in
37813 ± 0% +101.0% 75998 ± 1% vmstat.io.bo
37789 ± 0% +101.0% 75945 ± 1% iostat.md0.wkB/s
205 ± 0% +96.1% 402 ± 1% iostat.md0.w/s
164286 ± 1% -46.2% 88345 ± 2% vmstat.system.cs
27.07 ± 2% -46.7% 14.42 ± 3% turbostat.%Busy
810 ± 2% -46.7% 431 ± 3% turbostat.Avg_MHz
15.56 ± 2% +71.7% 26.71 ± 1% iostat.sda.avgqu-sz
15.65 ± 2% +69.1% 26.46 ± 2% iostat.sdc.avgqu-sz
15.67 ± 2% +72.7% 27.06 ± 2% iostat.sdb.avgqu-sz
25151 ± 0% +68.3% 42328 ± 1% iostat.sda.wkB/s
25153 ± 0% +68.2% 42305 ± 1% iostat.sdb.wkB/s
25149 ± 0% +68.2% 42292 ± 1% iostat.sdc.wkB/s
97.45 ± 0% -21.1% 76.90 ± 0% turbostat.CorWatt
12517 ± 0% -20.2% 9994 ± 1% iostat.sdc.rkB/s
12517 ± 0% -20.0% 10007 ± 1% iostat.sda.rkB/s
12512 ± 0% -19.9% 10018 ± 1% iostat.sdb.rkB/s
1863 ± 3% +24.7% 2325 ± 1% iostat.sdb.rrqm/s
1865 ± 3% +24.3% 2319 ± 1% iostat.sdc.rrqm/s
1864 ± 3% +24.6% 2322 ± 1% iostat.sda.rrqm/s
128 ± 0% -16.4% 107 ± 0% turbostat.PkgWatt
150569 ± 0% -8.7% 137525 ± 0% iostat.md0.avgqu-sz
4.29 ± 0% -5.1% 4.07 ± 0% turbostat.RAMWatt
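A note on reading fsmark.time.percent_of_cpu_this_job_got in the table above. Assuming the fsmark.time.* fields are collected with GNU time, which computes the CPU percentage as (user + system) / elapsed, a value of 1235 would mean roughly 12 CPUs busy on average over the run. The sketch below cross-checks that reading using only the system time and elapsed time from the table (user time is not listed, so the result is a lower bound). The helper name avg_cpus_busy is hypothetical.

```python
def avg_cpus_busy(cpu_seconds: float, elapsed_seconds: float) -> float:
    """Average number of CPUs kept busy: CPU time divided by wall-clock time."""
    return cpu_seconds / elapsed_seconds


# fsmark.time.system_time and fsmark.time.elapsed_time from the ext4 table.
base = avg_cpus_busy(13793, 1115)     # base commit 26089f49
patched = avg_cpus_busy(3603, 554)    # patched commit 4400755e

# Reported percent_of_cpu_this_job_got: 1235 (base) and 649 (patched).
print(f"base:    {base * 100:.0f}% from system time alone, reported 1235%")
print(f"patched: {patched * 100:.0f}% from system time alone, reported 649%")
```

Both results land within a few percent of the reported figures, which suggests that the job's CPU time in this testcase is almost entirely system time, and that the drop from about 12.4 to about 6.5 CPUs is a real reduction in kernel CPU usage.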
more detailed changes about ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
---------
26089f4902595a2f 4400755e356f9a2b0b7ceaa02f
---------------- --------------------------
%stddev %change %stddev
\ | \
273 ± 4% -18.1% 223 ± 6% fsmark.files_per_sec
29.24 ± 1% +27.2% 37.20 ± 8% fsmark.time.elapsed_time.max
29.24 ± 1% +27.2% 37.20 ± 8% fsmark.time.elapsed_time
399 ± 4% -20.0% 319 ± 3% fsmark.time.percent_of_cpu_this_job_got
129891 ± 20% -28.9% 92334 ± 15% fsmark.time.voluntary_context_switches
266 ± 0% +413.4% 1365 ± 5% slabinfo.raid5-md0.num_objs
266 ± 0% +413.4% 1365 ± 5% slabinfo.raid5-md0.active_objs
0.23 ± 27% +98.6% 0.46 ± 35% turbostat.CPU%c3
56612063 ± 9% +36.7% 77369763 ± 20% cpuidle.C1-IVT.time
5579498 ± 14% -36.0% 3571516 ± 6% cpuidle.C1E-IVT.time
4668 ± 38% +64.7% 7690 ± 19% numa-vmstat.node0.nr_unevictable
18674 ± 38% +64.7% 30762 ± 19% numa-meminfo.node0.Unevictable
9298 ± 37% +64.4% 15286 ± 19% proc-vmstat.nr_unevictable
4629 ± 37% +64.1% 7596 ± 19% numa-vmstat.node1.nr_unevictable
18535 ± 37% +63.9% 30385 ± 19% numa-meminfo.node1.Unevictable
4270894 ± 19% +65.6% 7070923 ± 21% cpuidle.C3-IVT.time
38457 ± 37% +59.0% 61148 ± 19% meminfo.Unevictable
3748226 ± 17% +26.6% 4743674 ± 16% numa-vmstat.node0.numa_local
4495283 ± 13% -24.8% 3382315 ± 17% numa-vmstat.node0.nr_free_pages
3818432 ± 16% +26.5% 4830938 ± 16% numa-vmstat.node0.numa_hit
17966826 ± 13% -24.7% 13537228 ± 17% numa-meminfo.node0.MemFree
14901309 ± 15% +29.7% 19330906 ± 12% numa-meminfo.node0.MemUsed
26 ± 21% -32.9% 17 ± 14% cpuidle.POLL.usage
1.183e+09 ± 1% +29.6% 1.533e+09 ± 8% cpuidle.C6-IVT.time
29.24 ± 1% +27.2% 37.20 ± 8% time.elapsed_time
29.24 ± 1% +27.2% 37.20 ± 8% time.elapsed_time.max
399 ± 4% -20.0% 319 ± 3% time.percent_of_cpu_this_job_got
850 ± 4% -8.6% 777 ± 5% slabinfo.blkdev_requests.num_objs
850 ± 4% -8.6% 777 ± 5% slabinfo.blkdev_requests.active_objs
14986 ± 9% +17.1% 17548 ± 8% numa-vmstat.node0.nr_slab_reclaimable
11943 ± 5% -12.6% 10441 ± 2% slabinfo.kmalloc-192.num_objs
59986 ± 9% +17.0% 70186 ± 8% numa-meminfo.node0.SReclaimable
3703 ± 6% +10.2% 4082 ± 7% slabinfo.btrfs_delayed_data_ref.num_objs
133551 ± 6% +16.1% 154995 ± 1% proc-vmstat.pgfault
129891 ± 20% -28.9% 92334 ± 15% time.voluntary_context_switches
11823 ± 4% -12.0% 10409 ± 3% slabinfo.kmalloc-192.active_objs
3703 ± 6% +9.7% 4061 ± 7% slabinfo.btrfs_delayed_data_ref.active_objs
19761 ± 2% -11.2% 17542 ± 6% slabinfo.anon_vma.active_objs
19761 ± 2% -11.2% 17544 ± 6% slabinfo.anon_vma.num_objs
13002 ± 3% +14.9% 14944 ± 5% slabinfo.kmalloc-256.num_objs
12695 ± 3% +13.8% 14446 ± 7% slabinfo.kmalloc-256.active_objs
1190 ± 1% -11.8% 1050 ± 3% slabinfo.mnt_cache.num_objs
1190 ± 1% -11.8% 1050 ± 3% slabinfo.mnt_cache.active_objs
136862 ± 1% -13.8% 117938 ± 7% cpuidle.C6-IVT.usage
1692630 ± 3% +12.3% 1900854 ± 0% numa-vmstat.node0.nr_written
1056 ± 2% +8.8% 1149 ± 3% slabinfo.mm_struct.active_objs
1056 ± 2% +8.8% 1149 ± 3% slabinfo.mm_struct.num_objs
24029 ± 11% -30.6% 16673 ± 8% vmstat.system.cs
8859 ± 2% -15.0% 7530 ± 8% vmstat.system.in
905630 ± 2% -16.8% 753097 ± 4% iostat.md0.wkB/s
906433 ± 2% -16.9% 753482 ± 4% vmstat.io.bo
3591 ± 2% -16.9% 2982 ± 4% iostat.md0.w/s
13.22 ± 5% -16.3% 11.07 ± 1% turbostat.%Busy
402 ± 4% -15.9% 338 ± 1% turbostat.Avg_MHz
54236 ± 3% +10.4% 59889 ± 4% iostat.md0.avgqu-sz
7.67 ± 1% +4.5% 8.01 ± 1% turbostat.RAMWatt
--yliu
* Re: performance changes on 4400755e: 200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and few more
2015-03-18 5:00 ` Yuanahn Liu
@ 2015-03-25 3:03 ` NeilBrown
-1 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2015-03-25 3:03 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 21121 bytes --]
On Wed, 18 Mar 2015 13:00:30 +0800 Yuanahn Liu <yuanhan.liu@linux.intel.com> wrote:

> Hi,
>
> FYI, we noticed performance changes on `fsmark.files_per_sec' by 4400755e356f9a2b0b7ceaa02f57b1c7546c3765:
>
> > commit 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> > Author: NeilBrown <neilb@suse.de>
> > AuthorDate: Thu Feb 26 12:47:56 2015 +1100
> > Commit: NeilBrown <neilb@suse.de>
> > CommitDate: Wed Mar 4 13:40:19 2015 +1100
> >
> > md/raid5: allow the stripe_cache to grow and shrink.

Thanks a lot for this testing!!! I was wondering how I could do some proper
testing of this patch, and you've done it for me :-)

The large number of improvements is very encouraging - that is what I was
hoping for of course.

The few regressions could be a concern. I note that they are all NoSync.
That seems to suggest that they could just be writing more data, i.e. the
data is written a bit earlier (certainly possible), so it happens to
introduce more delay ....

I guess I'm not really sure how to interpret NoSync results, and suspect that
poor NoSync results don't really reflect much on the underlying block device.
Could that be right?

Also, I'm a little confused by the fsmark.time.percent_of_cpu_this_job_got
statistic:

> 1235 ± 2% -47.5% 649 ± 3% fsmark.time.percent_of_cpu_this_job_got
> 399 ± 4% -20.0% 319 ± 3% fsmark.time.percent_of_cpu_this_job_got

Does that mean that the ext4 test changed from 12.4 cpus to 6.4, and that
the btrfs test changed from 4 cpus to 3.2 ??? Or does it just not mean
anything?
Thanks,
NeilBrown

>
> 26089f4902595a2f64c512066af07af6e82eb096 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> ---------------------------------------- ----------------------------------------
> run time(m) metric_value ±stddev run time(m) metric_value ±stddev change testbox/benchmark/sub-testcase
> --- ------ ---------------------------- --- ------ ---------------------------- -------- ------------------------------
> 3 18.6 6.400 ±0.0% 5 9.2 19.200 ±0.0% 200.0% ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
> 3 24.7 6.400 ±0.0% 3 13.7 12.800 ±0.0% 100.0% ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
> 3 17.5 28.267 ±9.6% 3 12.3 42.833 ±6.5% 51.5% ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-120G-NoSync
> 3 16.7 30.700 ±1.5% 3 12.6 40.733 ±2.4% 32.7% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
> 3 29.0 5.867 ±0.8% 5 23.6 7.240 ±0.7% 23.4% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
> 3 28.5 6.000 ±0.0% 3 23.2 7.367 ±0.6% 22.8% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
> 5 11.7 14.600 ±0.0% 5 9.7 17.500 ±0.4% 19.9% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
> 3 22.4 25.600 ±0.0% 5 17.9 30.120 ±4.1% 17.7% ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-120G-NoSync
> 5 10.8 47.320 ±0.6% 5 9.3 54.820 ±0.2% 15.8% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
> 1 0.5 252.400 ±0.0% 1 0.5 263.300 ±0.0% 4.3% ivb44/fsmark/1x-1t-4BRD_12G-RAID5-ext4-4M-30G-NoSync
>
> 3 0.5 273.100 ±4.3% 3 0.6 223.567 ±6.5% -18.1% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> 3 8.1 63.133 ±0.5% 3 9.2 55.633 ±0.2% -11.9% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync
> 3 8.2 64.000 ±0.0% 3 9.2 57.600 ±0.0% -10.0% ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync
>
>
> NOTE: here are some more info about those test parameters for you to
> understand the testcase better:
>
> 1x: where 'x' means iterations or loop, corresponding to the 'L' option of fsmark
> 1t, 64t: where 't' means thread
> 4M: means the single file size, corresponding to the '-s' option of fsmark
> 40G, 30G, 120G: means the total test size
>
> 4BRD_12G: BRD is the ramdisk, where '4' means 4 ramdisk, and where '12G' means
> the size of one ramdisk. So, it would be 48G in total. And we made a
> raid on those ramdisk.
>
>
> As you can see from above data, interestingly, all performance
> regressions come from btrfs testing. That's why Chris is also
> in the cc list, with which just FYI.
>
>
> FYI, here I listed more detailed changes for the maximal postive and negtive changes.
>
> more detailed changes about ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
> ---------
>
> 26089f4902595a2f 4400755e356f9a2b0b7ceaa02f
> ---------------- --------------------------
> %stddev %change %stddev
> \ | \
> 6.40 ± 0% +200.0% 19.20 ± 0% fsmark.files_per_sec
> 1.015e+08 ± 1% -73.6% 26767355 ± 3% fsmark.time.voluntary_context_switches
> 13793 ± 1% -73.9% 3603 ± 5% fsmark.time.system_time
> 78473 ± 6% -64.3% 28016 ± 7% fsmark.time.involuntary_context_switches
> 15789555 ± 9% -54.7% 7159485 ± 13% fsmark.app_overhead
> 1115 ± 0% -50.3% 554 ± 1% fsmark.time.elapsed_time.max
> 1115 ± 0% -50.3% 554 ± 1% fsmark.time.elapsed_time
> 1235 ± 2% -47.5% 649 ± 3% fsmark.time.percent_of_cpu_this_job_got
> 456465 ± 1% -26.7% 334594 ± 4% fsmark.time.minor_page_faults
> 275 ± 0% +1257.7% 3733 ± 2% slabinfo.raid5-md0.num_objs
> 275 ± 0% +1257.7% 3733 ± 2% slabinfo.raid5-md0.active_objs
> 11 ± 0% +1250.9% 148 ± 2% slabinfo.raid5-md0.active_slabs
> 11 ± 0% +1250.9% 148 ± 2% slabinfo.raid5-md0.num_slabs
> 2407 ± 4% +293.4% 9471 ± 26% numa-meminfo.node0.Writeback
> 600 ± 4% +294.9% 2372 ± 26% numa-vmstat.node0.nr_writeback
> 1114505 ± 0% -77.4% 251696 ± 2% softirqs.TASKLET
> 1808027 ± 1% -77.7% 402378 ± 4% softirqs.RCU
> 12158665 ± 1% -77.1% 2786069 ± 4% cpuidle.C3-IVT.usage
> 1119433 ± 0% -77.3% 254192 ± 2% softirqs.BLOCK
> 37824202 ± 1% -75.1% 9405078 ± 4% cpuidle.C6-IVT.usage
> 1.015e+08 ± 1% -73.6% 26767355 ± 3% time.voluntary_context_switches
> 13793 ± 1% -73.9% 3603 ± 5% time.system_time
> 5971084 ± 1% -73.6% 1574912 ± 5% softirqs.SCHED
> 10539492 ± 3% -72.0% 2956258 ± 6% cpuidle.C1E-IVT.usage
> 2 ± 0% +230.0% 6 ± 12% vmstat.procs.b
> 14064 ± 1% -71.2% 4049 ± 6% softirqs.HRTIMER
> 7388306 ± 1% -71.2% 2129929 ± 4% softirqs.TIMER
> 3.496e+09 ± 1% -70.3% 1.04e+09 ± 1% cpuidle.C3-IVT.time
> 0.88 ± 6% +224.9% 2.87 ± 11% turbostat.Pkg%pc6
> 19969464 ± 2% -66.2% 6750675 ± 5% cpuidle.C1-IVT.usage
> 78473 ± 6% -64.3% 28016 ± 7% time.involuntary_context_switches
> 4.23 ± 5% +181.4% 11.90 ± 3% turbostat.Pkg%pc2
> 2.551e+09 ± 1% -61.4% 9.837e+08 ± 3% cpuidle.C1E-IVT.time
> 8084 ± 3% +142.6% 19608 ± 3% meminfo.Writeback
> 2026 ± 4% +141.6% 4895 ± 4% proc-vmstat.nr_writeback
> 165 ± 4% -56.9% 71 ± 14% numa-vmstat.node1.nr_inactive_anon
> 7.748e+09 ± 3% -50.3% 3.852e+09 ± 3% cpuidle.C1-IVT.time
> 175 ± 5% -53.2% 82 ± 13% numa-vmstat.node1.nr_shmem
> 1115 ± 0% -50.3% 554 ± 1% time.elapsed_time.max
> 1115 ± 0% -50.3% 554 ± 1% time.elapsed_time
> 1147 ± 0% -49.0% 585 ± 1% uptime.boot
> 2260889 ± 0% -48.8% 1157272 ± 1% proc-vmstat.pgfree
> 16805 ± 2% -35.9% 10776 ± 23% numa-vmstat.node1.nr_dirty
> 1235 ± 2% -47.5% 649 ± 3% time.percent_of_cpu_this_job_got
> 67245 ± 2% -35.9% 43122 ± 23% numa-meminfo.node1.Dirty
> 39041 ± 0% -45.7% 21212 ± 2% uptime.idle
> 13 ± 9% -49.0% 6 ± 11% vmstat.procs.r
> 3072 ± 10% -40.3% 1833 ± 9% cpuidle.POLL.usage
> 3045115 ± 0% -46.1% 1642053 ± 1% proc-vmstat.pgfault
> 202 ± 1% -45.2% 110 ± 0% proc-vmstat.nr_inactive_anon
> 4583079 ± 2% -31.4% 3143602 ± 16% numa-vmstat.node1.numa_hit
> 28.03 ± 0% +69.1% 47.39 ± 1% turbostat.CPU%c6
> 223 ± 1% -41.1% 131 ± 1% proc-vmstat.nr_shmem
> 4518820 ± 3% -30.8% 3128304 ± 16% numa-vmstat.node1.numa_local
> 3363496 ± 3% -27.4% 2441619 ± 20% numa-vmstat.node1.nr_dirtied
> 3345346 ± 3% -27.4% 2428396 ± 20% numa-vmstat.node1.nr_written
> 0.18 ± 18% +105.6% 0.37 ± 36% turbostat.Pkg%pc3
> 3427913 ± 3% -27.3% 2492563 ± 20%
numa-vmstat.node1.nr_inactive_file > 13712431 ± 3% -27.3% 9971152 ± 20% numa-meminfo.node1.Inactive > 13711768 ± 3% -27.3% 9970866 ± 20% numa-meminfo.node1.Inactive(file) > 3444598 ± 3% -27.2% 2508920 ± 20% numa-vmstat.node1.nr_file_pages > 13778510 ± 3% -27.2% 10036287 ± 20% numa-meminfo.node1.FilePages > 8819175 ± 1% -28.3% 6320188 ± 19% numa-numastat.node1.numa_hit > 8819051 ± 1% -28.3% 6320152 ± 19% numa-numastat.node1.local_node > 14350918 ± 3% -26.8% 10504070 ± 19% numa-meminfo.node1.MemUsed > 100892 ± 3% -26.0% 74623 ± 19% numa-vmstat.node1.nr_slab_reclaimable > 403571 ± 3% -26.0% 298513 ± 19% numa-meminfo.node1.SReclaimable > 3525 ± 13% +36.6% 4817 ± 14% slabinfo.blkdev_requests.active_objs > 3552 ± 13% +36.3% 4841 ± 14% slabinfo.blkdev_requests.num_objs > 30779 ± 4% -34.7% 20084 ± 12% proc-vmstat.pgmigrate_success > 30779 ± 4% -34.7% 20084 ± 12% proc-vmstat.numa_pages_migrated > 447400 ± 2% -23.2% 343701 ± 16% numa-meminfo.node1.Slab > 2.532e+10 ± 0% -33.1% 1.694e+10 ± 1% cpuidle.C6-IVT.time > 3081 ± 9% +28.0% 3945 ± 12% slabinfo.mnt_cache.num_objs > 3026 ± 9% +28.8% 3898 ± 12% slabinfo.mnt_cache.active_objs > 5822 ± 4% +77.8% 10350 ± 25% numa-meminfo.node1.Writeback > 1454 ± 4% +77.3% 2579 ± 25% numa-vmstat.node1.nr_writeback > 424984 ± 1% -26.5% 312255 ± 3% proc-vmstat.numa_pte_updates > 368001 ± 1% -26.8% 269440 ± 3% proc-vmstat.numa_hint_faults > 456465 ± 1% -26.7% 334594 ± 4% time.minor_page_faults > 3.86 ± 3% -24.4% 2.92 ± 2% turbostat.CPU%c3 > 4661151 ± 2% +20.6% 5622999 ± 9% numa-vmstat.node1.nr_free_pages > 18644452 ± 2% +20.6% 22491300 ± 9% numa-meminfo.node1.MemFree > 876 ± 2% +28.2% 1124 ± 5% slabinfo.kmalloc-4096.num_objs > 858 ± 3% +24.0% 1064 ± 5% slabinfo.kmalloc-4096.active_objs > 17767832 ± 8% -25.4% 13249545 ± 17% cpuidle.POLL.time > 285093 ± 1% -23.1% 219372 ± 5% proc-vmstat.numa_hint_faults_local > 105423 ± 2% -16.1% 88498 ± 0% meminfo.Dirty > 26365 ± 1% -16.0% 22152 ± 1% proc-vmstat.nr_dirty > 41.04 ± 1% -14.1% 35.26 ± 1% 
turbostat.CPU%c1 > 9385 ± 4% -14.3% 8043 ± 6% slabinfo.kmalloc-192.active_objs > 9574 ± 3% -13.9% 8241 ± 6% slabinfo.kmalloc-192.num_objs > 2411 ± 3% +17.0% 2820 ± 4% slabinfo.kmalloc-2048.active_objs > 12595574 ± 0% -10.0% 11338368 ± 1% proc-vmstat.pgalloc_normal > 5262 ± 1% +13.3% 5962 ± 1% slabinfo.kmalloc-1024.num_objs > 5262 ± 1% +12.7% 5932 ± 1% slabinfo.kmalloc-1024.active_objs > 2538 ± 3% +13.7% 2885 ± 4% slabinfo.kmalloc-2048.num_objs > 5299546 ± 0% -9.9% 4776351 ± 0% slabinfo.buffer_head.active_objs > 5299546 ± 0% -9.9% 4776351 ± 0% slabinfo.buffer_head.num_objs > 135885 ± 0% -9.9% 122470 ± 0% slabinfo.buffer_head.num_slabs > 135885 ± 0% -9.9% 122470 ± 0% slabinfo.buffer_head.active_slabs > 28.04 ± 2% +715.6% 228.69 ± 3% iostat.sdb.avgrq-sz > 28.05 ± 2% +708.1% 226.72 ± 2% iostat.sdc.avgrq-sz > 2245 ± 3% -81.6% 413 ± 1% iostat.sda.w/s > 5.33 ± 1% +1008.2% 59.07 ± 1% iostat.sda.w_await > 5.85 ± 1% +1126.4% 71.69 ± 4% iostat.sda.r_await > 5.36 ± 1% +978.6% 57.79 ± 3% iostat.sdc.w_await > 1263 ± 4% -85.8% 179 ± 6% iostat.sdc.r/s > 2257 ± 3% -81.6% 414 ± 2% iostat.sdb.w/s > 1264 ± 4% -85.8% 179 ± 6% iostat.sdb.r/s > 5.55 ± 0% +1024.2% 62.37 ± 4% iostat.sdb.await > 5.89 ± 1% +1125.9% 72.16 ± 6% iostat.sdb.r_await > 5.36 ± 0% +1014.3% 59.75 ± 3% iostat.sdb.w_await > 5.57 ± 1% +987.9% 60.55 ± 3% iostat.sdc.await > 5.51 ± 0% +1017.3% 61.58 ± 1% iostat.sda.await > 1264 ± 4% -85.8% 179 ± 6% iostat.sda.r/s > 28.09 ± 2% +714.2% 228.73 ± 2% iostat.sda.avgrq-sz > 5.95 ± 2% +1091.0% 70.82 ± 6% iostat.sdc.r_await > 2252 ± 3% -81.5% 417 ± 2% iostat.sdc.w/s > 4032 ± 2% +151.6% 10143 ± 1% iostat.sdb.wrqm/s > 4043 ± 2% +151.0% 10150 ± 1% iostat.sda.wrqm/s > 4035 ± 2% +151.2% 10138 ± 1% iostat.sdc.wrqm/s > 26252 ± 1% -54.0% 12077 ± 4% vmstat.system.in > 37813 ± 0% +101.0% 75998 ± 1% vmstat.io.bo > 37789 ± 0% +101.0% 75945 ± 1% iostat.md0.wkB/s > 205 ± 0% +96.1% 402 ± 1% iostat.md0.w/s > 164286 ± 1% -46.2% 88345 ± 2% vmstat.system.cs > 27.07 ± 2% -46.7% 14.42 ± 3% 
turbostat.%Busy > 810 ± 2% -46.7% 431 ± 3% turbostat.Avg_MHz > 15.56 ± 2% +71.7% 26.71 ± 1% iostat.sda.avgqu-sz > 15.65 ± 2% +69.1% 26.46 ± 2% iostat.sdc.avgqu-sz > 15.67 ± 2% +72.7% 27.06 ± 2% iostat.sdb.avgqu-sz > 25151 ± 0% +68.3% 42328 ± 1% iostat.sda.wkB/s > 25153 ± 0% +68.2% 42305 ± 1% iostat.sdb.wkB/s > 25149 ± 0% +68.2% 42292 ± 1% iostat.sdc.wkB/s > 97.45 ± 0% -21.1% 76.90 ± 0% turbostat.CorWatt > 12517 ± 0% -20.2% 9994 ± 1% iostat.sdc.rkB/s > 12517 ± 0% -20.0% 10007 ± 1% iostat.sda.rkB/s > 12512 ± 0% -19.9% 10018 ± 1% iostat.sdb.rkB/s > 1863 ± 3% +24.7% 2325 ± 1% iostat.sdb.rrqm/s > 1865 ± 3% +24.3% 2319 ± 1% iostat.sdc.rrqm/s > 1864 ± 3% +24.6% 2322 ± 1% iostat.sda.rrqm/s > 128 ± 0% -16.4% 107 ± 0% turbostat.PkgWatt > 150569 ± 0% -8.7% 137525 ± 0% iostat.md0.avgqu-sz > 4.29 ± 0% -5.1% 4.07 ± 0% turbostat.RAMWatt > > > more detailed changes about ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync > --------- > > 26089f4902595a2f 4400755e356f9a2b0b7ceaa02f > ---------------- -------------------------- > %stddev %change %stddev > \ | \ > 273 ± 4% -18.1% 223 ± 6% fsmark.files_per_sec > 29.24 ± 1% +27.2% 37.20 ± 8% fsmark.time.elapsed_time.max > 29.24 ± 1% +27.2% 37.20 ± 8% fsmark.time.elapsed_time > 399 ± 4% -20.0% 319 ± 3% fsmark.time.percent_of_cpu_this_job_got > 129891 ± 20% -28.9% 92334 ± 15% fsmark.time.voluntary_context_switches > 266 ± 0% +413.4% 1365 ± 5% slabinfo.raid5-md0.num_objs > 266 ± 0% +413.4% 1365 ± 5% slabinfo.raid5-md0.active_objs > 0.23 ± 27% +98.6% 0.46 ± 35% turbostat.CPU%c3 > 56612063 ± 9% +36.7% 77369763 ± 20% cpuidle.C1-IVT.time > 5579498 ± 14% -36.0% 3571516 ± 6% cpuidle.C1E-IVT.time > 4668 ± 38% +64.7% 7690 ± 19% numa-vmstat.node0.nr_unevictable > 18674 ± 38% +64.7% 30762 ± 19% numa-meminfo.node0.Unevictable > 9298 ± 37% +64.4% 15286 ± 19% proc-vmstat.nr_unevictable > 4629 ± 37% +64.1% 7596 ± 19% numa-vmstat.node1.nr_unevictable > 18535 ± 37% +63.9% 30385 ± 19% numa-meminfo.node1.Unevictable > 4270894 ± 19% +65.6% 7070923 ± 21% 
cpuidle.C3-IVT.time > 38457 ± 37% +59.0% 61148 ± 19% meminfo.Unevictable > 3748226 ± 17% +26.6% 4743674 ± 16% numa-vmstat.node0.numa_local > 4495283 ± 13% -24.8% 3382315 ± 17% numa-vmstat.node0.nr_free_pages > 3818432 ± 16% +26.5% 4830938 ± 16% numa-vmstat.node0.numa_hit > 17966826 ± 13% -24.7% 13537228 ± 17% numa-meminfo.node0.MemFree > 14901309 ± 15% +29.7% 19330906 ± 12% numa-meminfo.node0.MemUsed > 26 ± 21% -32.9% 17 ± 14% cpuidle.POLL.usage > 1.183e+09 ± 1% +29.6% 1.533e+09 ± 8% cpuidle.C6-IVT.time > 29.24 ± 1% +27.2% 37.20 ± 8% time.elapsed_time > 29.24 ± 1% +27.2% 37.20 ± 8% time.elapsed_time.max > 399 ± 4% -20.0% 319 ± 3% time.percent_of_cpu_this_job_got > 850 ± 4% -8.6% 777 ± 5% slabinfo.blkdev_requests.num_objs > 850 ± 4% -8.6% 777 ± 5% slabinfo.blkdev_requests.active_objs > 14986 ± 9% +17.1% 17548 ± 8% numa-vmstat.node0.nr_slab_reclaimable > 11943 ± 5% -12.6% 10441 ± 2% slabinfo.kmalloc-192.num_objs > 59986 ± 9% +17.0% 70186 ± 8% numa-meminfo.node0.SReclaimable > 3703 ± 6% +10.2% 4082 ± 7% slabinfo.btrfs_delayed_data_ref.num_objs > 133551 ± 6% +16.1% 154995 ± 1% proc-vmstat.pgfault > 129891 ± 20% -28.9% 92334 ± 15% time.voluntary_context_switches > 11823 ± 4% -12.0% 10409 ± 3% slabinfo.kmalloc-192.active_objs > 3703 ± 6% +9.7% 4061 ± 7% slabinfo.btrfs_delayed_data_ref.active_objs > 19761 ± 2% -11.2% 17542 ± 6% slabinfo.anon_vma.active_objs > 19761 ± 2% -11.2% 17544 ± 6% slabinfo.anon_vma.num_objs > 13002 ± 3% +14.9% 14944 ± 5% slabinfo.kmalloc-256.num_objs > 12695 ± 3% +13.8% 14446 ± 7% slabinfo.kmalloc-256.active_objs > 1190 ± 1% -11.8% 1050 ± 3% slabinfo.mnt_cache.num_objs > 1190 ± 1% -11.8% 1050 ± 3% slabinfo.mnt_cache.active_objs > 136862 ± 1% -13.8% 117938 ± 7% cpuidle.C6-IVT.usage > 1692630 ± 3% +12.3% 1900854 ± 0% numa-vmstat.node0.nr_written > 1056 ± 2% +8.8% 1149 ± 3% slabinfo.mm_struct.active_objs > 1056 ± 2% +8.8% 1149 ± 3% slabinfo.mm_struct.num_objs > 24029 ± 11% -30.6% 16673 ± 8% vmstat.system.cs > 8859 ± 2% -15.0% 7530 ± 8% 
vmstat.system.in > 905630 ± 2% -16.8% 753097 ± 4% iostat.md0.wkB/s > 906433 ± 2% -16.9% 753482 ± 4% vmstat.io.bo > 3591 ± 2% -16.9% 2982 ± 4% iostat.md0.w/s > 13.22 ± 5% -16.3% 11.07 ± 1% turbostat.%Busy > 402 ± 4% -15.9% 338 ± 1% turbostat.Avg_MHz > 54236 ± 3% +10.4% 59889 ± 4% iostat.md0.avgqu-sz > 7.67 ± 1% +4.5% 8.01 ± 1% turbostat.RAMWatt > > > > --yliu > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo(a)vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ [-- Attachment #2: attachment.sig --] [-- Type: application/pgp-signature, Size: 811 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
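The percent_of_cpu_this_job_got question raised in the message above can be checked against the quoted ext4 figures. This is only a minimal sketch: it assumes the statistic is GNU time's %P (CPU time divided by elapsed time, times 100), and it uses just the system_time and elapsed_time values from the ext4 fsyncBeforeClose table. User time is not listed in the report, so treating it as negligible is an assumption, and the function names are illustrative.

```python
def cpu_percent(cpu_seconds, elapsed_seconds):
    """CPU time as a percentage of wall-clock time (100% == one full CPU)."""
    return 100.0 * cpu_seconds / elapsed_seconds


def cpus_worth(percent):
    """Convert a percent-of-CPU figure into an average number of busy CPUs."""
    return percent / 100.0


# system_time and elapsed_time (seconds) from the ext4 fsyncBeforeClose table.
# User time is not in the report and is assumed to be negligible here.
before = cpu_percent(13793, 1115)
after = cpu_percent(3603, 554)

print(f"before: {before:.0f}% ~ {cpus_worth(before):.1f} CPUs")
print(f"after:  {after:.0f}% ~ {cpus_worth(after):.1f} CPUs")
```

Under those assumptions both results land within about 1% of the reported 1235% and 649%, which would be consistent with reading the statistic as the average number of CPUs kept busy (times 100) rather than as a count of CPUs assigned to the job.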
* Re: performance changes on 4400755e: 200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and few more
2015-03-25 3:03 ` NeilBrown
@ 2015-03-26 4:30 ` Yuanhan Liu
-1 siblings, 0 replies; 6+ messages in thread
From: Yuanhan Liu @ 2015-03-26 4:30 UTC (permalink / raw)
To: lkp
[-- Attachment #1: Type: text/plain, Size: 27959 bytes --]
On Wed, Mar 25, 2015 at 02:03:59PM +1100, NeilBrown wrote:
> On Wed, 18 Mar 2015 13:00:30 +0800 Yuanahn Liu <yuanhan.liu@linux.intel.com>
> wrote:
>
> > Hi,
> >
> > FYI, we noticed performance changes on `fsmark.files_per_sec' by 4400755e356f9a2b0b7ceaa02f57b1c7546c3765:
> >
> > > commit 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> > > Author: NeilBrown <neilb@suse.de>
> > > AuthorDate: Thu Feb 26 12:47:56 2015 +1100
> > > Commit: NeilBrown <neilb@suse.de>
> > > CommitDate: Wed Mar 4 13:40:19 2015 +1100
> > >
> > > md/raid5: allow the stripe_cache to grow and shrink.
>
> Thanks a lot for this testing!!! I was wondering how I could do some proper
> testing of this patch, and you've done it for me :-)

Welcome!

>
> The large number of improvements is very encouraging - that is what I was
> hoping for of course.
>
> The few regressions could be a concern. I note that they are all NoSync.
> That seems to suggest that they could just be writing more data.

It's not a time-based test, but a size-based test:

> > 40G, 30G, 120G: means the total test size

Hence, I doubt that it is writing more data.

> i.e. the data is written a bit earlier (certainly possible) so it happens to
> introduce more delay ....
>
> I guess I'm not really sure how to interpret NoSync results, and suspect that
> poor NoSync results don't really reflect much on the underlying block device.
> Could that be right?

Sorry, I'm not quite sure I followed you. Poor NoSync result? Do you mean the
small numbers like 63.133 and 57.600? They are in units of files_per_sec, and
the file size is 4M. Hence, that would be 200+ MB/s, which is not that bad in
this case, as it's a 3-hard-disk RAID5.

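The 200+ MB/s estimate above is files_per_sec multiplied by the 4M file size. A minimal sketch of that arithmetic, assuming "4M" means 4 MB per file and using the btrfs NoSync values from the report; the function and variable names are illustrative:

```python
FILE_SIZE_MB = 4  # "4M" in the test name; treated as 4 MB per file (assumption)


def throughput_mb_per_s(files_per_sec, file_size_mb=FILE_SIZE_MB):
    """Approximate write throughput implied by an fsmark files_per_sec value."""
    return files_per_sec * file_size_mb


# files_per_sec for the two 3HDD btrfs NoSync regressions, before and after
# commit 4400755e, taken from the summary table in the report.
btrfs_nosync = {
    "1t before": 63.133,
    "1t after": 55.633,
    "64t before": 64.000,
    "64t after": 57.600,
}

for label, fps in btrfs_nosync.items():
    print(f"{label}: {throughput_mb_per_s(fps):.1f} MB/s")
```

Every one of these values stays above 200 MB/s, so the regression is a drop from an already high rate rather than a collapse in throughput.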
> > 3 8.1 63.133 ±0.5% 3 9.2 55.633 ±0.2% -11.9% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync

Here are a few iostat samples from 26089f4902595a2f64c512066af07af6e82eb096 for the above test:

avg-cpu: %user %nice %system %iowait %steal %idle
         0.00 0.00 0.63 1.67 0.00 97.70

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 30353.00 0.00 240.00 0.00 121860.00 1015.50 1.29 5.35 0.00 5.35 3.50 83.90
sdc 0.00 30353.00 0.00 241.00 0.00 122372.00 1015.54 0.66 2.74 0.00 2.74 2.53 60.90
sda 0.00 30353.00 0.00 242.00 0.00 122884.00 1015.57 1.29 5.36 0.00 5.36 3.52 85.20
md0 0.00 0.00 0.00 956.00 0.00 244736.00 512.00 227231.39 0.00 0.00 0.00 1.05 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
         0.02 0.00 0.69 1.69 0.00 97.60

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 30988.00 0.00 247.00 0.00 125444.00 1015.74 1.77 7.17 0.00 7.17 4.02 99.40
sdc 0.00 30988.00 0.00 245.00 0.00 124420.00 1015.67 1.19 4.82 0.00 4.82 3.67 89.90
sda 0.00 30988.00 0.00 247.00 0.00 125444.00 1015.74 0.65 2.65 0.00 2.65 2.54 62.70
md0 0.00 0.00 0.00 976.00 0.00 249856.00 512.00 228206.37 0.00 0.00 0.00 1.02 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
         0.00 0.00 0.61 1.67 0.00 97.72

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 29718.00 0.00 235.00 0.00 119300.00 1015.32 1.35 5.71 0.00 5.71 3.71 87.20
sdc 0.00 29718.00 0.00 236.00 0.00 119812.00 1015.36 1.19 5.06 0.00 5.06 3.43 80.90
sda 0.00 29718.00 0.00 235.00 0.00 119300.00 1015.32 0.87 3.69 0.00 3.69 2.99 70.20
md0 0.00 0.00 0.00 936.00 0.00 239616.00 512.00 229157.33 0.00 0.00 0.00 1.07 100.00

And a few iostat samples from 4400755e356f9a2b0b7ceaa02f57b1c7546c3765 (first bad commit):

avg-cpu: %user %nice %system %iowait %steal %idle
         0.02 0.00 1.09 1.54 0.00 97.35

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 1.00 27677.00 1.00 206.00 8.00 100516.00 971.25 27.40 130.56 196.00 130.24 4.72 97.70
sdc 0.00 27677.00 0.00 207.00 0.00 101028.00 976.12 27.05 129.43 0.00 129.43 4.61 95.50
sda 5.00 27677.00 1.00 211.00 16.00 102984.00 971.70 26.61 127.00 201.00 126.64 4.50 95.50
md0 0.00 0.00 0.00 824.00 0.00 210944.00 512.00 224122.02 0.00 0.00 0.00 1.21 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
         0.00 0.00 0.98 1.54 0.00 97.47

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 3.00 21203.00 1.00 218.00 16.00 107060.00 977.86 30.44 147.77 198.00 147.54 4.53 99.10
sdc 2.00 21203.00 2.00 220.00 16.00 108592.00 978.45 31.12 150.65 208.00 150.13 4.43 98.40
sda 0.00 21203.00 1.00 220.00 24.00 108020.00 977.77 30.56 150.88 197.00 150.67 4.38 96.80
md0 0.00 0.00 0.00 720.00 0.00 184320.00 512.00 224963.92 0.00 0.00 0.00 1.39 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
         0.02 0.00 0.96 1.63 0.00 97.39

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 11.00 29455.00 3.00 213.00 56.00 102958.00 953.83 31.19 134.97 205.00 133.99 4.56 98.40
sdc 0.00 29454.00 0.00 210.00 0.00 99890.00 951.33 29.36 127.07 0.00 127.07 4.36 91.60
sda 1.00 29454.00 0.00 215.00 0.00 103534.00 963.11 27.54 117.54 0.00 117.54 4.26 91.60
md0 0.00 0.00 0.00 876.00 0.00 224256.00 512.00 225993.60 0.00 0.00 0.00 1.14 100.10

> > 3 0.5 273.100 ±4.3% 3 0.6 223.567 ±6.5% -18.1% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> > 3 8.1 63.133 ±0.5% 3 9.2 55.633 ±0.2% -11.9% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync
> > 3 8.2 64.000 ±0.0% 3 9.2 57.600 ±0.0% -10.0% ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync
>
> Also, I'm a little confused by the
> fsmark.time.involuntary_context_switches
> statistic:
>
> > 1235 ± 2% -47.5% 649 ± 3% fsmark.time.percent_of_cpu_this_job_got
>
> > 399 ± 4% -20.0% 319 ± 3% fsmark.time.percent_of_cpu_this_job_got
>
> Does that means that the ext4 test changed from 12.4 cpus to 6.4, and that
> the btrfs test chnages from 4 cpus to 3.2 ???

fsmark.time.percent_of_cpu_this_job_got is output from /usr/bin/time, which
comes from the GNU time package. Here is the explanation from its source code:

* P == percent of CPU this job got (total cpu time / elapsed time)

--yliu

>
> Or does it just not mean anything?
>
> Thanks,
> NeilBrown
>
>
> >
> >
> > 26089f4902595a2f64c512066af07af6e82eb096 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> > ---------------------------------------- ----------------------------------------
> > run time(m) metric_value ±stddev run time(m) metric_value ±stddev change testbox/benchmark/sub-testcase
> > --- ------ ---------------------------- --- ------ ---------------------------- -------- ------------------------------
> > 3 18.6 6.400 ±0.0% 5 9.2 19.200 ±0.0% 200.0% ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
> > 3 24.7 6.400 ±0.0% 3 13.7 12.800 ±0.0% 100.0% ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
> > 3 17.5 28.267 ±9.6% 3 12.3 42.833 ±6.5% 51.5% ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-120G-NoSync
> > 3 16.7 30.700 ±1.5% 3 12.6 40.733 ±2.4% 32.7% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
> > 3 29.0 5.867 ±0.8% 5 23.6 7.240 ±0.7% 23.4% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
> > 3 28.5 6.000 ±0.0% 3 23.2 7.367 ±0.6% 22.8% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
> > 5 11.7 14.600 ±0.0% 5 9.7 17.500 ±0.4% 19.9% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
> > 3 22.4 25.600 ±0.0% 5 17.9 30.120 ±4.1% 17.7% ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-120G-NoSync
> > 5 10.8 47.320 ±0.6% 5 9.3 54.820 ±0.2% 15.8% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
> > 1 0.5 252.400 ±0.0% 1 0.5 263.300 ±0.0% 4.3% ivb44/fsmark/1x-1t-4BRD_12G-RAID5-ext4-4M-30G-NoSync
> >
> > 3 0.5 273.100 ±4.3% 3 0.6 223.567 ±6.5% -18.1% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> > 3 8.1 63.133 ±0.5% 3 9.2 55.633 ±0.2% -11.9%
ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync > > 3 8.2 64.000 ±0.0% 3 9.2 57.600 ±0.0% -10.0% ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync > > > > > > NOTE: here are some more info about those test parameters for you to > > understand the testcase better: > > > > 1x: where 'x' means iterations or loop, corresponding to the 'L' option of fsmark > > 1t, 64t: where 't' means thread > > 4M: means the single file size, corresponding to the '-s' option of fsmark > > 40G, 30G, 120G: means the total test size > > > > 4BRD_12G: BRD is the ramdisk, where '4' means 4 ramdisk, and where '12G' means > > the size of one ramdisk. So, it would be 48G in total. And we made a > > raid on those ramdisk. > > > > > > As you can see from above data, interestingly, all performance > > regressions come from btrfs testing. That's why Chris is also > > in the cc list, with which just FYI. > > > > > > FYI, here I listed more detailed changes for the maximal postive and negtive changes. > > > > more detailed changes about ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose > > --------- > > > > 26089f4902595a2f 4400755e356f9a2b0b7ceaa02f > > ---------------- -------------------------- > > %stddev %change %stddev > > \ | \ > > 6.40 ± 0% +200.0% 19.20 ± 0% fsmark.files_per_sec > > 1.015e+08 ± 1% -73.6% 26767355 ± 3% fsmark.time.voluntary_context_switches > > 13793 ± 1% -73.9% 3603 ± 5% fsmark.time.system_time > > 78473 ± 6% -64.3% 28016 ± 7% fsmark.time.involuntary_context_switches > > 15789555 ± 9% -54.7% 7159485 ± 13% fsmark.app_overhead > > 1115 ± 0% -50.3% 554 ± 1% fsmark.time.elapsed_time.max > > 1115 ± 0% -50.3% 554 ± 1% fsmark.time.elapsed_time > > 1235 ± 2% -47.5% 649 ± 3% fsmark.time.percent_of_cpu_this_job_got > > 456465 ± 1% -26.7% 334594 ± 4% fsmark.time.minor_page_faults > > 275 ± 0% +1257.7% 3733 ± 2% slabinfo.raid5-md0.num_objs > > 275 ± 0% +1257.7% 3733 ± 2% slabinfo.raid5-md0.active_objs > > 11 ± 0% +1250.9% 148 ± 2% slabinfo.raid5-md0.active_slabs > > 11 ± 
0% +1250.9% 148 ± 2% slabinfo.raid5-md0.num_slabs > > 2407 ± 4% +293.4% 9471 ± 26% numa-meminfo.node0.Writeback > > 600 ± 4% +294.9% 2372 ± 26% numa-vmstat.node0.nr_writeback > > 1114505 ± 0% -77.4% 251696 ± 2% softirqs.TASKLET > > 1808027 ± 1% -77.7% 402378 ± 4% softirqs.RCU > > 12158665 ± 1% -77.1% 2786069 ± 4% cpuidle.C3-IVT.usage > > 1119433 ± 0% -77.3% 254192 ± 2% softirqs.BLOCK > > 37824202 ± 1% -75.1% 9405078 ± 4% cpuidle.C6-IVT.usage > > 1.015e+08 ± 1% -73.6% 26767355 ± 3% time.voluntary_context_switches > > 13793 ± 1% -73.9% 3603 ± 5% time.system_time > > 5971084 ± 1% -73.6% 1574912 ± 5% softirqs.SCHED > > 10539492 ± 3% -72.0% 2956258 ± 6% cpuidle.C1E-IVT.usage > > 2 ± 0% +230.0% 6 ± 12% vmstat.procs.b > > 14064 ± 1% -71.2% 4049 ± 6% softirqs.HRTIMER > > 7388306 ± 1% -71.2% 2129929 ± 4% softirqs.TIMER > > 3.496e+09 ± 1% -70.3% 1.04e+09 ± 1% cpuidle.C3-IVT.time > > 0.88 ± 6% +224.9% 2.87 ± 11% turbostat.Pkg%pc6 > > 19969464 ± 2% -66.2% 6750675 ± 5% cpuidle.C1-IVT.usage > > 78473 ± 6% -64.3% 28016 ± 7% time.involuntary_context_switches > > 4.23 ± 5% +181.4% 11.90 ± 3% turbostat.Pkg%pc2 > > 2.551e+09 ± 1% -61.4% 9.837e+08 ± 3% cpuidle.C1E-IVT.time > > 8084 ± 3% +142.6% 19608 ± 3% meminfo.Writeback > > 2026 ± 4% +141.6% 4895 ± 4% proc-vmstat.nr_writeback > > 165 ± 4% -56.9% 71 ± 14% numa-vmstat.node1.nr_inactive_anon > > 7.748e+09 ± 3% -50.3% 3.852e+09 ± 3% cpuidle.C1-IVT.time > > 175 ± 5% -53.2% 82 ± 13% numa-vmstat.node1.nr_shmem > > 1115 ± 0% -50.3% 554 ± 1% time.elapsed_time.max > > 1115 ± 0% -50.3% 554 ± 1% time.elapsed_time > > 1147 ± 0% -49.0% 585 ± 1% uptime.boot > > 2260889 ± 0% -48.8% 1157272 ± 1% proc-vmstat.pgfree > > 16805 ± 2% -35.9% 10776 ± 23% numa-vmstat.node1.nr_dirty > > 1235 ± 2% -47.5% 649 ± 3% time.percent_of_cpu_this_job_got > > 67245 ± 2% -35.9% 43122 ± 23% numa-meminfo.node1.Dirty > > 39041 ± 0% -45.7% 21212 ± 2% uptime.idle > > 13 ± 9% -49.0% 6 ± 11% vmstat.procs.r > > 3072 ± 10% -40.3% 1833 ± 9% cpuidle.POLL.usage > > 3045115 ± 0% 
-46.1% 1642053 ± 1% proc-vmstat.pgfault > > 202 ± 1% -45.2% 110 ± 0% proc-vmstat.nr_inactive_anon > > 4583079 ± 2% -31.4% 3143602 ± 16% numa-vmstat.node1.numa_hit > > 28.03 ± 0% +69.1% 47.39 ± 1% turbostat.CPU%c6 > > 223 ± 1% -41.1% 131 ± 1% proc-vmstat.nr_shmem > > 4518820 ± 3% -30.8% 3128304 ± 16% numa-vmstat.node1.numa_local > > 3363496 ± 3% -27.4% 2441619 ± 20% numa-vmstat.node1.nr_dirtied > > 3345346 ± 3% -27.4% 2428396 ± 20% numa-vmstat.node1.nr_written > > 0.18 ± 18% +105.6% 0.37 ± 36% turbostat.Pkg%pc3 > > 3427913 ± 3% -27.3% 2492563 ± 20% numa-vmstat.node1.nr_inactive_file > > 13712431 ± 3% -27.3% 9971152 ± 20% numa-meminfo.node1.Inactive > > 13711768 ± 3% -27.3% 9970866 ± 20% numa-meminfo.node1.Inactive(file) > > 3444598 ± 3% -27.2% 2508920 ± 20% numa-vmstat.node1.nr_file_pages > > 13778510 ± 3% -27.2% 10036287 ± 20% numa-meminfo.node1.FilePages > > 8819175 ± 1% -28.3% 6320188 ± 19% numa-numastat.node1.numa_hit > > 8819051 ± 1% -28.3% 6320152 ± 19% numa-numastat.node1.local_node > > 14350918 ± 3% -26.8% 10504070 ± 19% numa-meminfo.node1.MemUsed > > 100892 ± 3% -26.0% 74623 ± 19% numa-vmstat.node1.nr_slab_reclaimable > > 403571 ± 3% -26.0% 298513 ± 19% numa-meminfo.node1.SReclaimable > > 3525 ± 13% +36.6% 4817 ± 14% slabinfo.blkdev_requests.active_objs > > 3552 ± 13% +36.3% 4841 ± 14% slabinfo.blkdev_requests.num_objs > > 30779 ± 4% -34.7% 20084 ± 12% proc-vmstat.pgmigrate_success > > 30779 ± 4% -34.7% 20084 ± 12% proc-vmstat.numa_pages_migrated > > 447400 ± 2% -23.2% 343701 ± 16% numa-meminfo.node1.Slab > > 2.532e+10 ± 0% -33.1% 1.694e+10 ± 1% cpuidle.C6-IVT.time > > 3081 ± 9% +28.0% 3945 ± 12% slabinfo.mnt_cache.num_objs > > 3026 ± 9% +28.8% 3898 ± 12% slabinfo.mnt_cache.active_objs > > 5822 ± 4% +77.8% 10350 ± 25% numa-meminfo.node1.Writeback > > 1454 ± 4% +77.3% 2579 ± 25% numa-vmstat.node1.nr_writeback > > 424984 ± 1% -26.5% 312255 ± 3% proc-vmstat.numa_pte_updates > > 368001 ± 1% -26.8% 269440 ± 3% proc-vmstat.numa_hint_faults > > 456465 ± 1% -26.7% 
334594 ± 4% time.minor_page_faults > > 3.86 ± 3% -24.4% 2.92 ± 2% turbostat.CPU%c3 > > 4661151 ± 2% +20.6% 5622999 ± 9% numa-vmstat.node1.nr_free_pages > > 18644452 ± 2% +20.6% 22491300 ± 9% numa-meminfo.node1.MemFree > > 876 ± 2% +28.2% 1124 ± 5% slabinfo.kmalloc-4096.num_objs > > 858 ± 3% +24.0% 1064 ± 5% slabinfo.kmalloc-4096.active_objs > > 17767832 ± 8% -25.4% 13249545 ± 17% cpuidle.POLL.time > > 285093 ± 1% -23.1% 219372 ± 5% proc-vmstat.numa_hint_faults_local > > 105423 ± 2% -16.1% 88498 ± 0% meminfo.Dirty > > 26365 ± 1% -16.0% 22152 ± 1% proc-vmstat.nr_dirty > > 41.04 ± 1% -14.1% 35.26 ± 1% turbostat.CPU%c1 > > 9385 ± 4% -14.3% 8043 ± 6% slabinfo.kmalloc-192.active_objs > > 9574 ± 3% -13.9% 8241 ± 6% slabinfo.kmalloc-192.num_objs > > 2411 ± 3% +17.0% 2820 ± 4% slabinfo.kmalloc-2048.active_objs > > 12595574 ± 0% -10.0% 11338368 ± 1% proc-vmstat.pgalloc_normal > > 5262 ± 1% +13.3% 5962 ± 1% slabinfo.kmalloc-1024.num_objs > > 5262 ± 1% +12.7% 5932 ± 1% slabinfo.kmalloc-1024.active_objs > > 2538 ± 3% +13.7% 2885 ± 4% slabinfo.kmalloc-2048.num_objs > > 5299546 ± 0% -9.9% 4776351 ± 0% slabinfo.buffer_head.active_objs > > 5299546 ± 0% -9.9% 4776351 ± 0% slabinfo.buffer_head.num_objs > > 135885 ± 0% -9.9% 122470 ± 0% slabinfo.buffer_head.num_slabs > > 135885 ± 0% -9.9% 122470 ± 0% slabinfo.buffer_head.active_slabs > > 28.04 ± 2% +715.6% 228.69 ± 3% iostat.sdb.avgrq-sz > > 28.05 ± 2% +708.1% 226.72 ± 2% iostat.sdc.avgrq-sz > > 2245 ± 3% -81.6% 413 ± 1% iostat.sda.w/s > > 5.33 ± 1% +1008.2% 59.07 ± 1% iostat.sda.w_await > > 5.85 ± 1% +1126.4% 71.69 ± 4% iostat.sda.r_await > > 5.36 ± 1% +978.6% 57.79 ± 3% iostat.sdc.w_await > > 1263 ± 4% -85.8% 179 ± 6% iostat.sdc.r/s > > 2257 ± 3% -81.6% 414 ± 2% iostat.sdb.w/s > > 1264 ± 4% -85.8% 179 ± 6% iostat.sdb.r/s > > 5.55 ± 0% +1024.2% 62.37 ± 4% iostat.sdb.await > > 5.89 ± 1% +1125.9% 72.16 ± 6% iostat.sdb.r_await > > 5.36 ± 0% +1014.3% 59.75 ± 3% iostat.sdb.w_await > > 5.57 ± 1% +987.9% 60.55 ± 3% iostat.sdc.await > > 5.51 
± 0% +1017.3% 61.58 ± 1% iostat.sda.await > > 1264 ± 4% -85.8% 179 ± 6% iostat.sda.r/s > > 28.09 ± 2% +714.2% 228.73 ± 2% iostat.sda.avgrq-sz > > 5.95 ± 2% +1091.0% 70.82 ± 6% iostat.sdc.r_await > > 2252 ± 3% -81.5% 417 ± 2% iostat.sdc.w/s > > 4032 ± 2% +151.6% 10143 ± 1% iostat.sdb.wrqm/s > > 4043 ± 2% +151.0% 10150 ± 1% iostat.sda.wrqm/s > > 4035 ± 2% +151.2% 10138 ± 1% iostat.sdc.wrqm/s > > 26252 ± 1% -54.0% 12077 ± 4% vmstat.system.in > > 37813 ± 0% +101.0% 75998 ± 1% vmstat.io.bo > > 37789 ± 0% +101.0% 75945 ± 1% iostat.md0.wkB/s > > 205 ± 0% +96.1% 402 ± 1% iostat.md0.w/s > > 164286 ± 1% -46.2% 88345 ± 2% vmstat.system.cs > > 27.07 ± 2% -46.7% 14.42 ± 3% turbostat.%Busy > > 810 ± 2% -46.7% 431 ± 3% turbostat.Avg_MHz > > 15.56 ± 2% +71.7% 26.71 ± 1% iostat.sda.avgqu-sz > > 15.65 ± 2% +69.1% 26.46 ± 2% iostat.sdc.avgqu-sz > > 15.67 ± 2% +72.7% 27.06 ± 2% iostat.sdb.avgqu-sz > > 25151 ± 0% +68.3% 42328 ± 1% iostat.sda.wkB/s > > 25153 ± 0% +68.2% 42305 ± 1% iostat.sdb.wkB/s > > 25149 ± 0% +68.2% 42292 ± 1% iostat.sdc.wkB/s > > 97.45 ± 0% -21.1% 76.90 ± 0% turbostat.CorWatt > > 12517 ± 0% -20.2% 9994 ± 1% iostat.sdc.rkB/s > > 12517 ± 0% -20.0% 10007 ± 1% iostat.sda.rkB/s > > 12512 ± 0% -19.9% 10018 ± 1% iostat.sdb.rkB/s > > 1863 ± 3% +24.7% 2325 ± 1% iostat.sdb.rrqm/s > > 1865 ± 3% +24.3% 2319 ± 1% iostat.sdc.rrqm/s > > 1864 ± 3% +24.6% 2322 ± 1% iostat.sda.rrqm/s > > 128 ± 0% -16.4% 107 ± 0% turbostat.PkgWatt > > 150569 ± 0% -8.7% 137525 ± 0% iostat.md0.avgqu-sz > > 4.29 ± 0% -5.1% 4.07 ± 0% turbostat.RAMWatt > > > > > > more detailed changes about ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync > > --------- > > > > 26089f4902595a2f 4400755e356f9a2b0b7ceaa02f > > ---------------- -------------------------- > > %stddev %change %stddev > > \ | \ > > 273 ± 4% -18.1% 223 ± 6% fsmark.files_per_sec > > 29.24 ± 1% +27.2% 37.20 ± 8% fsmark.time.elapsed_time.max > > 29.24 ± 1% +27.2% 37.20 ± 8% fsmark.time.elapsed_time > > 399 ± 4% -20.0% 319 ± 3% 
fsmark.time.percent_of_cpu_this_job_got > > 129891 ± 20% -28.9% 92334 ± 15% fsmark.time.voluntary_context_switches > > 266 ± 0% +413.4% 1365 ± 5% slabinfo.raid5-md0.num_objs > > 266 ± 0% +413.4% 1365 ± 5% slabinfo.raid5-md0.active_objs > > 0.23 ± 27% +98.6% 0.46 ± 35% turbostat.CPU%c3 > > 56612063 ± 9% +36.7% 77369763 ± 20% cpuidle.C1-IVT.time > > 5579498 ± 14% -36.0% 3571516 ± 6% cpuidle.C1E-IVT.time > > 4668 ± 38% +64.7% 7690 ± 19% numa-vmstat.node0.nr_unevictable > > 18674 ± 38% +64.7% 30762 ± 19% numa-meminfo.node0.Unevictable > > 9298 ± 37% +64.4% 15286 ± 19% proc-vmstat.nr_unevictable > > 4629 ± 37% +64.1% 7596 ± 19% numa-vmstat.node1.nr_unevictable > > 18535 ± 37% +63.9% 30385 ± 19% numa-meminfo.node1.Unevictable > > 4270894 ± 19% +65.6% 7070923 ± 21% cpuidle.C3-IVT.time > > 38457 ± 37% +59.0% 61148 ± 19% meminfo.Unevictable > > 3748226 ± 17% +26.6% 4743674 ± 16% numa-vmstat.node0.numa_local > > 4495283 ± 13% -24.8% 3382315 ± 17% numa-vmstat.node0.nr_free_pages > > 3818432 ± 16% +26.5% 4830938 ± 16% numa-vmstat.node0.numa_hit > > 17966826 ± 13% -24.7% 13537228 ± 17% numa-meminfo.node0.MemFree > > 14901309 ± 15% +29.7% 19330906 ± 12% numa-meminfo.node0.MemUsed > > 26 ± 21% -32.9% 17 ± 14% cpuidle.POLL.usage > > 1.183e+09 ± 1% +29.6% 1.533e+09 ± 8% cpuidle.C6-IVT.time > > 29.24 ± 1% +27.2% 37.20 ± 8% time.elapsed_time > > 29.24 ± 1% +27.2% 37.20 ± 8% time.elapsed_time.max > > 399 ± 4% -20.0% 319 ± 3% time.percent_of_cpu_this_job_got > > 850 ± 4% -8.6% 777 ± 5% slabinfo.blkdev_requests.num_objs > > 850 ± 4% -8.6% 777 ± 5% slabinfo.blkdev_requests.active_objs > > 14986 ± 9% +17.1% 17548 ± 8% numa-vmstat.node0.nr_slab_reclaimable > > 11943 ± 5% -12.6% 10441 ± 2% slabinfo.kmalloc-192.num_objs > > 59986 ± 9% +17.0% 70186 ± 8% numa-meminfo.node0.SReclaimable > > 3703 ± 6% +10.2% 4082 ± 7% slabinfo.btrfs_delayed_data_ref.num_objs > > 133551 ± 6% +16.1% 154995 ± 1% proc-vmstat.pgfault > > 129891 ± 20% -28.9% 92334 ± 15% time.voluntary_context_switches > > 11823 ± 4% 
-12.0% 10409 ± 3% slabinfo.kmalloc-192.active_objs
> > 3703 ± 6% +9.7% 4061 ± 7% slabinfo.btrfs_delayed_data_ref.active_objs
> > 19761 ± 2% -11.2% 17542 ± 6% slabinfo.anon_vma.active_objs
> > 19761 ± 2% -11.2% 17544 ± 6% slabinfo.anon_vma.num_objs
> > 13002 ± 3% +14.9% 14944 ± 5% slabinfo.kmalloc-256.num_objs
> > 12695 ± 3% +13.8% 14446 ± 7% slabinfo.kmalloc-256.active_objs
> > 1190 ± 1% -11.8% 1050 ± 3% slabinfo.mnt_cache.num_objs
> > 1190 ± 1% -11.8% 1050 ± 3% slabinfo.mnt_cache.active_objs
> > 136862 ± 1% -13.8% 117938 ± 7% cpuidle.C6-IVT.usage
> > 1692630 ± 3% +12.3% 1900854 ± 0% numa-vmstat.node0.nr_written
> > 1056 ± 2% +8.8% 1149 ± 3% slabinfo.mm_struct.active_objs
> > 1056 ± 2% +8.8% 1149 ± 3% slabinfo.mm_struct.num_objs
> > 24029 ± 11% -30.6% 16673 ± 8% vmstat.system.cs
> > 8859 ± 2% -15.0% 7530 ± 8% vmstat.system.in
> > 905630 ± 2% -16.8% 753097 ± 4% iostat.md0.wkB/s
> > 906433 ± 2% -16.9% 753482 ± 4% vmstat.io.bo
> > 3591 ± 2% -16.9% 2982 ± 4% iostat.md0.w/s
> > 13.22 ± 5% -16.3% 11.07 ± 1% turbostat.%Busy
> > 402 ± 4% -15.9% 338 ± 1% turbostat.Avg_MHz
> > 54236 ± 3% +10.4% 59889 ± 4% iostat.md0.avgqu-sz
> > 7.67 ± 1% +4.5% 8.01 ± 1% turbostat.RAMWatt
> >
> > --yliu
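On the percent_of_cpu_this_job_got question discussed in the message above: since GNU time defines P as total CPU time divided by elapsed time, a value above 100 can be read as an average number of busy CPUs. The sketch below only illustrates that reading. The function names are made up for this example, and using system_time alone as the CPU time (ignoring user time, which the report does not list) is an assumption.

```python
# Interpret GNU time's "percent of CPU this job got" (P).
# P = 100 * (total cpu time) / (elapsed time), so P / 100 is the
# average number of CPUs kept busy over the run.


def percent_of_cpu(cpu_seconds, elapsed_seconds):
    """Return P as a percentage, following the GNU time definition."""
    return 100.0 * cpu_seconds / elapsed_seconds


def average_cpus_busy(percent):
    """Return the average number of busy CPUs implied by P."""
    return percent / 100.0


if __name__ == "__main__":
    # ext4 fsyncBeforeClose case: system_time and elapsed_time from the report.
    # Assumption: user time is negligible, so system_time stands in for CPU time.
    before = percent_of_cpu(13793, 1115)
    after = percent_of_cpu(3603, 554)
    print(f"before: {before:.0f}% ({average_cpus_busy(before):.2f} CPUs)")
    print(f"after:  {after:.0f}% ({average_cpus_busy(after):.2f} CPUs)")
```

Under that assumption the recomputed values land close to the reported 1235 and 649, which supports reading them as roughly 12.4 and 6.5 CPUs' worth of work on average.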
* Re: performance changes on 4400755e: 200.0% fsmark.files_per_sec, -18.1% fsmark.files_per_sec, and few more
@ 2015-03-26 4:30 ` Yuanhan Liu
0 siblings, 0 replies; 6+ messages in thread
From: Yuanhan Liu @ 2015-03-26 4:30 UTC (permalink / raw)
To: NeilBrown; +Cc: lkp, LKML, Chris Mason, Yuanhan Liu
On Wed, Mar 25, 2015 at 02:03:59PM +1100, NeilBrown wrote:
> On Wed, 18 Mar 2015 13:00:30 +0800 Yuanahn Liu <yuanhan.liu@linux.intel.com>
> wrote:
>
> > Hi,
> >
> > FYI, we noticed performance changes on `fsmark.files_per_sec' by 4400755e356f9a2b0b7ceaa02f57b1c7546c3765:
> >
> > > commit 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> > > Author: NeilBrown <neilb@suse.de>
> > > AuthorDate: Thu Feb 26 12:47:56 2015 +1100
> > > Commit: NeilBrown <neilb@suse.de>
> > > CommitDate: Wed Mar 4 13:40:19 2015 +1100
> > >
> > > md/raid5: allow the stripe_cache to grow and shrink.
>
> Thanks a lot for this testing!!! I was wondering how I could do some proper
> testing of this patch, and you've done it for me :-)

Welcome!

>
> The large number of improvements is very encouraging - that is what I was
> hoping for of course.
>
> The few regressions could be a concern. I note that are all NoSync.
> That seems to suggest that they could just be writing more data.

It's not a time-based test but a size-based test:

> > 40G, 30G, 120G: means the total test size

Hence, I doubt it is writing more data.

> i.e. the data is written a bit earlier (certainly possible) so it happen to
> introduce more delay ....
>
> I guess I'm not really sure how to interpret NoSync results, and suspect that
> poor NoSync result don't really reflect much on the underlying block device.
> Could that be right?

Sorry, I'm not quite sure I follow you. Poor NoSync result? Do you mean the
small numbers like 63.133 and 57.600? They are in units of files_per_sec, and the
file size is 4M. Hence, that is 200+ MB/s, which is not that bad in this case,
as it's a 3-hard-disk RAID5.
> > 3 8.1 63.133 ±0.5% 3 9.2 55.633 ±0.2% -11.9% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync

Here are a few iostat samples from 26089f4902595a2f64c512066af07af6e82eb096 for the above test:

avg-cpu: %user %nice %system %iowait %steal %idle
         0.00 0.00 0.63 1.67 0.00 97.70

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 30353.00 0.00 240.00 0.00 121860.00 1015.50 1.29 5.35 0.00 5.35 3.50 83.90
sdc 0.00 30353.00 0.00 241.00 0.00 122372.00 1015.54 0.66 2.74 0.00 2.74 2.53 60.90
sda 0.00 30353.00 0.00 242.00 0.00 122884.00 1015.57 1.29 5.36 0.00 5.36 3.52 85.20
md0 0.00 0.00 0.00 956.00 0.00 244736.00 512.00 227231.39 0.00 0.00 0.00 1.05 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
         0.02 0.00 0.69 1.69 0.00 97.60

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 30988.00 0.00 247.00 0.00 125444.00 1015.74 1.77 7.17 0.00 7.17 4.02 99.40
sdc 0.00 30988.00 0.00 245.00 0.00 124420.00 1015.67 1.19 4.82 0.00 4.82 3.67 89.90
sda 0.00 30988.00 0.00 247.00 0.00 125444.00 1015.74 0.65 2.65 0.00 2.65 2.54 62.70
md0 0.00 0.00 0.00 976.00 0.00 249856.00 512.00 228206.37 0.00 0.00 0.00 1.02 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
         0.00 0.00 0.61 1.67 0.00 97.72

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 29718.00 0.00 235.00 0.00 119300.00 1015.32 1.35 5.71 0.00 5.71 3.71 87.20
sdc 0.00 29718.00 0.00 236.00 0.00 119812.00 1015.36 1.19 5.06 0.00 5.06 3.43 80.90
sda 0.00 29718.00 0.00 235.00 0.00 119300.00 1015.32 0.87 3.69 0.00 3.69 2.99 70.20
md0 0.00 0.00 0.00 936.00 0.00 239616.00 512.00 229157.33 0.00 0.00 0.00 1.07 100.00

And a few iostat samples from 4400755e356f9a2b0b7ceaa02f57b1c7546c3765 (first bad commit):

avg-cpu: %user %nice %system %iowait %steal %idle
         0.02 0.00 1.09 1.54 0.00 97.35

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 1.00 27677.00 1.00 206.00 8.00 100516.00 971.25 27.40 130.56 196.00 130.24 4.72 97.70
sdc 0.00 27677.00 0.00 207.00 0.00 101028.00 976.12 27.05 129.43 0.00 129.43 4.61 95.50
sda 5.00 27677.00 1.00 211.00 16.00 102984.00 971.70 26.61 127.00 201.00 126.64 4.50 95.50
md0 0.00 0.00 0.00 824.00 0.00 210944.00 512.00 224122.02 0.00 0.00 0.00 1.21 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
         0.00 0.00 0.98 1.54 0.00 97.47

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 3.00 21203.00 1.00 218.00 16.00 107060.00 977.86 30.44 147.77 198.00 147.54 4.53 99.10
sdc 2.00 21203.00 2.00 220.00 16.00 108592.00 978.45 31.12 150.65 208.00 150.13 4.43 98.40
sda 0.00 21203.00 1.00 220.00 24.00 108020.00 977.77 30.56 150.88 197.00 150.67 4.38 96.80
md0 0.00 0.00 0.00 720.00 0.00 184320.00 512.00 224963.92 0.00 0.00 0.00 1.39 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
         0.02 0.00 0.96 1.63 0.00 97.39

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 11.00 29455.00 3.00 213.00 56.00 102958.00 953.83 31.19 134.97 205.00 133.99 4.56 98.40
sdc 0.00 29454.00 0.00 210.00 0.00 99890.00 951.33 29.36 127.07 0.00 127.07 4.36 91.60
sda 1.00 29454.00 0.00 215.00 0.00 103534.00 963.11 27.54 117.54 0.00 117.54 4.26 91.60
md0 0.00 0.00 0.00 876.00 0.00 224256.00 512.00 225993.60 0.00 0.00 0.00 1.14 100.10

> > 3 0.5 273.100 ±4.3% 3 0.6 223.567 ±6.5% -18.1% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> > 3 8.1 63.133 ±0.5% 3 9.2 55.633 ±0.2% -11.9% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync
> > 3 8.2 64.000 ±0.0% 3 9.2 57.600 ±0.0% -10.0% ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync
>
> Also, I'm a little confused by the
> fsmark.time.involuntary_context_switches
> statistic:
>
> > 1235 ± 2% -47.5% 649 ± 3% fsmark.time.percent_of_cpu_this_job_got
>
> > 399 ± 4% -20.0% 319 ± 3% fsmark.time.percent_of_cpu_this_job_got
>
> Does that means that the ext4 test changed from 12.4 cpus to 6.4, and that
> the btrfs test chnages from 4 cpus to 3.2 ???

fsmark.time.percent_of_cpu_this_job_got is output from /usr/bin/time, which
comes from the GNU time package. Here is the explanation from its source code:

* P == percent of CPU this job got (total cpu time / elapsed time)

--yliu

>
> Or does it just not mean anything?
>
> Thanks,
> NeilBrown
>
>
> >
> >
> > 26089f4902595a2f64c512066af07af6e82eb096 4400755e356f9a2b0b7ceaa02f57b1c7546c3765
> > ---------------------------------------- ----------------------------------------
> > run time(m) metric_value ±stddev run time(m) metric_value ±stddev change testbox/benchmark/sub-testcase
> > --- ------ ---------------------------- --- ------ ---------------------------- -------- ------------------------------
> > 3 18.6 6.400 ±0.0% 5 9.2 19.200 ±0.0% 200.0% ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose
> > 3 24.7 6.400 ±0.0% 3 13.7 12.800 ±0.0% 100.0% ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
> > 3 17.5 28.267 ±9.6% 3 12.3 42.833 ±6.5% 51.5% ivb44/fsmark/1x-64t-3HDD-RAID5-f2fs-4M-120G-NoSync
> > 3 16.7 30.700 ±1.5% 3 12.6 40.733 ±2.4% 32.7% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-120G-NoSync
> > 3 29.0 5.867 ±0.8% 5 23.6 7.240 ±0.7% 23.4% ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-40G-fsyncBeforeClose
> > 3 28.5 6.000 ±0.0% 3 23.2 7.367 ±0.6% 22.8% ivb44/fsmark/1x-1t-3HDD-RAID5-f2fs-4M-40G-fsyncBeforeClose
> > 5 11.7 14.600 ±0.0% 5 9.7 17.500 ±0.4% 19.9% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-40G-fsyncBeforeClose
> > 3 22.4 25.600 ±0.0% 5 17.9 30.120 ±4.1% 17.7% ivb44/fsmark/1x-64t-3HDD-RAID5-xfs-4M-120G-NoSync
> > 5 10.8 47.320 ±0.6% 5 9.3 54.820 ±0.2% 15.8% ivb44/fsmark/1x-1t-3HDD-RAID5-xfs-4M-120G-NoSync
> > 1 0.5 252.400 ±0.0% 1 0.5 263.300 ±0.0% 4.3% ivb44/fsmark/1x-1t-4BRD_12G-RAID5-ext4-4M-30G-NoSync
> >
> > 3 0.5 273.100 ±4.3% 3 0.6 223.567 ±6.5% -18.1% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> > 3 8.1 63.133 ±0.5% 3 9.2 55.633 ±0.2% -11.9%
ivb44/fsmark/1x-1t-3HDD-RAID5-btrfs-4M-120G-NoSync > > 3 8.2 64.000 ±0.0% 3 9.2 57.600 ±0.0% -10.0% ivb44/fsmark/1x-64t-3HDD-RAID5-btrfs-4M-120G-NoSync > > > > > > NOTE: here are some more info about those test parameters for you to > > understand the testcase better: > > > > 1x: where 'x' means iterations or loop, corresponding to the 'L' option of fsmark > > 1t, 64t: where 't' means thread > > 4M: means the single file size, corresponding to the '-s' option of fsmark > > 40G, 30G, 120G: means the total test size > > > > 4BRD_12G: BRD is the ramdisk, where '4' means 4 ramdisk, and where '12G' means > > the size of one ramdisk. So, it would be 48G in total. And we made a > > raid on those ramdisk. > > > > > > As you can see from above data, interestingly, all performance > > regressions come from btrfs testing. That's why Chris is also > > in the cc list, with which just FYI. > > > > > > FYI, here I listed more detailed changes for the maximal postive and negtive changes. > > > > more detailed changes about ivb44/fsmark/1x-64t-3HDD-RAID5-ext4-4M-40G-fsyncBeforeClose > > --------- > > > > 26089f4902595a2f 4400755e356f9a2b0b7ceaa02f > > ---------------- -------------------------- > > %stddev %change %stddev > > \ | \ > > 6.40 ± 0% +200.0% 19.20 ± 0% fsmark.files_per_sec > > 1.015e+08 ± 1% -73.6% 26767355 ± 3% fsmark.time.voluntary_context_switches > > 13793 ± 1% -73.9% 3603 ± 5% fsmark.time.system_time > > 78473 ± 6% -64.3% 28016 ± 7% fsmark.time.involuntary_context_switches > > 15789555 ± 9% -54.7% 7159485 ± 13% fsmark.app_overhead > > 1115 ± 0% -50.3% 554 ± 1% fsmark.time.elapsed_time.max > > 1115 ± 0% -50.3% 554 ± 1% fsmark.time.elapsed_time > > 1235 ± 2% -47.5% 649 ± 3% fsmark.time.percent_of_cpu_this_job_got > > 456465 ± 1% -26.7% 334594 ± 4% fsmark.time.minor_page_faults > > 275 ± 0% +1257.7% 3733 ± 2% slabinfo.raid5-md0.num_objs > > 275 ± 0% +1257.7% 3733 ± 2% slabinfo.raid5-md0.active_objs > > 11 ± 0% +1250.9% 148 ± 2% slabinfo.raid5-md0.active_slabs > > 11 ± 
0% +1250.9% 148 ± 2% slabinfo.raid5-md0.num_slabs > > 2407 ± 4% +293.4% 9471 ± 26% numa-meminfo.node0.Writeback > > 600 ± 4% +294.9% 2372 ± 26% numa-vmstat.node0.nr_writeback > > 1114505 ± 0% -77.4% 251696 ± 2% softirqs.TASKLET > > 1808027 ± 1% -77.7% 402378 ± 4% softirqs.RCU > > 12158665 ± 1% -77.1% 2786069 ± 4% cpuidle.C3-IVT.usage > > 1119433 ± 0% -77.3% 254192 ± 2% softirqs.BLOCK > > 37824202 ± 1% -75.1% 9405078 ± 4% cpuidle.C6-IVT.usage > > 1.015e+08 ± 1% -73.6% 26767355 ± 3% time.voluntary_context_switches > > 13793 ± 1% -73.9% 3603 ± 5% time.system_time > > 5971084 ± 1% -73.6% 1574912 ± 5% softirqs.SCHED > > 10539492 ± 3% -72.0% 2956258 ± 6% cpuidle.C1E-IVT.usage > > 2 ± 0% +230.0% 6 ± 12% vmstat.procs.b > > 14064 ± 1% -71.2% 4049 ± 6% softirqs.HRTIMER > > 7388306 ± 1% -71.2% 2129929 ± 4% softirqs.TIMER > > 3.496e+09 ± 1% -70.3% 1.04e+09 ± 1% cpuidle.C3-IVT.time > > 0.88 ± 6% +224.9% 2.87 ± 11% turbostat.Pkg%pc6 > > 19969464 ± 2% -66.2% 6750675 ± 5% cpuidle.C1-IVT.usage > > 78473 ± 6% -64.3% 28016 ± 7% time.involuntary_context_switches > > 4.23 ± 5% +181.4% 11.90 ± 3% turbostat.Pkg%pc2 > > 2.551e+09 ± 1% -61.4% 9.837e+08 ± 3% cpuidle.C1E-IVT.time > > 8084 ± 3% +142.6% 19608 ± 3% meminfo.Writeback > > 2026 ± 4% +141.6% 4895 ± 4% proc-vmstat.nr_writeback > > 165 ± 4% -56.9% 71 ± 14% numa-vmstat.node1.nr_inactive_anon > > 7.748e+09 ± 3% -50.3% 3.852e+09 ± 3% cpuidle.C1-IVT.time > > 175 ± 5% -53.2% 82 ± 13% numa-vmstat.node1.nr_shmem > > 1115 ± 0% -50.3% 554 ± 1% time.elapsed_time.max > > 1115 ± 0% -50.3% 554 ± 1% time.elapsed_time > > 1147 ± 0% -49.0% 585 ± 1% uptime.boot > > 2260889 ± 0% -48.8% 1157272 ± 1% proc-vmstat.pgfree > > 16805 ± 2% -35.9% 10776 ± 23% numa-vmstat.node1.nr_dirty > > 1235 ± 2% -47.5% 649 ± 3% time.percent_of_cpu_this_job_got > > 67245 ± 2% -35.9% 43122 ± 23% numa-meminfo.node1.Dirty > > 39041 ± 0% -45.7% 21212 ± 2% uptime.idle > > 13 ± 9% -49.0% 6 ± 11% vmstat.procs.r > > 3072 ± 10% -40.3% 1833 ± 9% cpuidle.POLL.usage > > 3045115 ± 0% 
-46.1% 1642053 ± 1% proc-vmstat.pgfault > > 202 ± 1% -45.2% 110 ± 0% proc-vmstat.nr_inactive_anon > > 4583079 ± 2% -31.4% 3143602 ± 16% numa-vmstat.node1.numa_hit > > 28.03 ± 0% +69.1% 47.39 ± 1% turbostat.CPU%c6 > > 223 ± 1% -41.1% 131 ± 1% proc-vmstat.nr_shmem > > 4518820 ± 3% -30.8% 3128304 ± 16% numa-vmstat.node1.numa_local > > 3363496 ± 3% -27.4% 2441619 ± 20% numa-vmstat.node1.nr_dirtied > > 3345346 ± 3% -27.4% 2428396 ± 20% numa-vmstat.node1.nr_written > > 0.18 ± 18% +105.6% 0.37 ± 36% turbostat.Pkg%pc3 > > 3427913 ± 3% -27.3% 2492563 ± 20% numa-vmstat.node1.nr_inactive_file > > 13712431 ± 3% -27.3% 9971152 ± 20% numa-meminfo.node1.Inactive > > 13711768 ± 3% -27.3% 9970866 ± 20% numa-meminfo.node1.Inactive(file) > > 3444598 ± 3% -27.2% 2508920 ± 20% numa-vmstat.node1.nr_file_pages > > 13778510 ± 3% -27.2% 10036287 ± 20% numa-meminfo.node1.FilePages > > 8819175 ± 1% -28.3% 6320188 ± 19% numa-numastat.node1.numa_hit > > 8819051 ± 1% -28.3% 6320152 ± 19% numa-numastat.node1.local_node > > 14350918 ± 3% -26.8% 10504070 ± 19% numa-meminfo.node1.MemUsed > > 100892 ± 3% -26.0% 74623 ± 19% numa-vmstat.node1.nr_slab_reclaimable > > 403571 ± 3% -26.0% 298513 ± 19% numa-meminfo.node1.SReclaimable > > 3525 ± 13% +36.6% 4817 ± 14% slabinfo.blkdev_requests.active_objs > > 3552 ± 13% +36.3% 4841 ± 14% slabinfo.blkdev_requests.num_objs > > 30779 ± 4% -34.7% 20084 ± 12% proc-vmstat.pgmigrate_success > > 30779 ± 4% -34.7% 20084 ± 12% proc-vmstat.numa_pages_migrated > > 447400 ± 2% -23.2% 343701 ± 16% numa-meminfo.node1.Slab > > 2.532e+10 ± 0% -33.1% 1.694e+10 ± 1% cpuidle.C6-IVT.time > > 3081 ± 9% +28.0% 3945 ± 12% slabinfo.mnt_cache.num_objs > > 3026 ± 9% +28.8% 3898 ± 12% slabinfo.mnt_cache.active_objs > > 5822 ± 4% +77.8% 10350 ± 25% numa-meminfo.node1.Writeback > > 1454 ± 4% +77.3% 2579 ± 25% numa-vmstat.node1.nr_writeback > > 424984 ± 1% -26.5% 312255 ± 3% proc-vmstat.numa_pte_updates > > 368001 ± 1% -26.8% 269440 ± 3% proc-vmstat.numa_hint_faults > > 456465 ± 1% -26.7% 
334594 ± 4% time.minor_page_faults
> >       3.86 ±  3%     -24.4%       2.92 ±  2%  turbostat.CPU%c3
> >    4661151 ±  2%     +20.6%    5622999 ±  9%  numa-vmstat.node1.nr_free_pages
> >   18644452 ±  2%     +20.6%   22491300 ±  9%  numa-meminfo.node1.MemFree
> >        876 ±  2%     +28.2%       1124 ±  5%  slabinfo.kmalloc-4096.num_objs
> >        858 ±  3%     +24.0%       1064 ±  5%  slabinfo.kmalloc-4096.active_objs
> >   17767832 ±  8%     -25.4%   13249545 ± 17%  cpuidle.POLL.time
> >     285093 ±  1%     -23.1%     219372 ±  5%  proc-vmstat.numa_hint_faults_local
> >     105423 ±  2%     -16.1%      88498 ±  0%  meminfo.Dirty
> >      26365 ±  1%     -16.0%      22152 ±  1%  proc-vmstat.nr_dirty
> >      41.04 ±  1%     -14.1%      35.26 ±  1%  turbostat.CPU%c1
> >       9385 ±  4%     -14.3%       8043 ±  6%  slabinfo.kmalloc-192.active_objs
> >       9574 ±  3%     -13.9%       8241 ±  6%  slabinfo.kmalloc-192.num_objs
> >       2411 ±  3%     +17.0%       2820 ±  4%  slabinfo.kmalloc-2048.active_objs
> >   12595574 ±  0%     -10.0%   11338368 ±  1%  proc-vmstat.pgalloc_normal
> >       5262 ±  1%     +13.3%       5962 ±  1%  slabinfo.kmalloc-1024.num_objs
> >       5262 ±  1%     +12.7%       5932 ±  1%  slabinfo.kmalloc-1024.active_objs
> >       2538 ±  3%     +13.7%       2885 ±  4%  slabinfo.kmalloc-2048.num_objs
> >    5299546 ±  0%      -9.9%    4776351 ±  0%  slabinfo.buffer_head.active_objs
> >    5299546 ±  0%      -9.9%    4776351 ±  0%  slabinfo.buffer_head.num_objs
> >     135885 ±  0%      -9.9%     122470 ±  0%  slabinfo.buffer_head.num_slabs
> >     135885 ±  0%      -9.9%     122470 ±  0%  slabinfo.buffer_head.active_slabs
> >      28.04 ±  2%    +715.6%     228.69 ±  3%  iostat.sdb.avgrq-sz
> >      28.05 ±  2%    +708.1%     226.72 ±  2%  iostat.sdc.avgrq-sz
> >       2245 ±  3%     -81.6%        413 ±  1%  iostat.sda.w/s
> >       5.33 ±  1%   +1008.2%      59.07 ±  1%  iostat.sda.w_await
> >       5.85 ±  1%   +1126.4%      71.69 ±  4%  iostat.sda.r_await
> >       5.36 ±  1%    +978.6%      57.79 ±  3%  iostat.sdc.w_await
> >       1263 ±  4%     -85.8%        179 ±  6%  iostat.sdc.r/s
> >       2257 ±  3%     -81.6%        414 ±  2%  iostat.sdb.w/s
> >       1264 ±  4%     -85.8%        179 ±  6%  iostat.sdb.r/s
> >       5.55 ±  0%   +1024.2%      62.37 ±  4%  iostat.sdb.await
> >       5.89 ±  1%   +1125.9%      72.16 ±  6%  iostat.sdb.r_await
> >       5.36 ±  0%   +1014.3%      59.75 ±  3%  iostat.sdb.w_await
> >       5.57 ±  1%    +987.9%      60.55 ±  3%  iostat.sdc.await
> >       5.51 ±  0%   +1017.3%      61.58 ±  1%  iostat.sda.await
> >       1264 ±  4%     -85.8%        179 ±  6%  iostat.sda.r/s
> >      28.09 ±  2%    +714.2%     228.73 ±  2%  iostat.sda.avgrq-sz
> >       5.95 ±  2%   +1091.0%      70.82 ±  6%  iostat.sdc.r_await
> >       2252 ±  3%     -81.5%        417 ±  2%  iostat.sdc.w/s
> >       4032 ±  2%    +151.6%      10143 ±  1%  iostat.sdb.wrqm/s
> >       4043 ±  2%    +151.0%      10150 ±  1%  iostat.sda.wrqm/s
> >       4035 ±  2%    +151.2%      10138 ±  1%  iostat.sdc.wrqm/s
> >      26252 ±  1%     -54.0%      12077 ±  4%  vmstat.system.in
> >      37813 ±  0%    +101.0%      75998 ±  1%  vmstat.io.bo
> >      37789 ±  0%    +101.0%      75945 ±  1%  iostat.md0.wkB/s
> >        205 ±  0%     +96.1%        402 ±  1%  iostat.md0.w/s
> >     164286 ±  1%     -46.2%      88345 ±  2%  vmstat.system.cs
> >      27.07 ±  2%     -46.7%      14.42 ±  3%  turbostat.%Busy
> >        810 ±  2%     -46.7%        431 ±  3%  turbostat.Avg_MHz
> >      15.56 ±  2%     +71.7%      26.71 ±  1%  iostat.sda.avgqu-sz
> >      15.65 ±  2%     +69.1%      26.46 ±  2%  iostat.sdc.avgqu-sz
> >      15.67 ±  2%     +72.7%      27.06 ±  2%  iostat.sdb.avgqu-sz
> >      25151 ±  0%     +68.3%      42328 ±  1%  iostat.sda.wkB/s
> >      25153 ±  0%     +68.2%      42305 ±  1%  iostat.sdb.wkB/s
> >      25149 ±  0%     +68.2%      42292 ±  1%  iostat.sdc.wkB/s
> >      97.45 ±  0%     -21.1%      76.90 ±  0%  turbostat.CorWatt
> >      12517 ±  0%     -20.2%       9994 ±  1%  iostat.sdc.rkB/s
> >      12517 ±  0%     -20.0%      10007 ±  1%  iostat.sda.rkB/s
> >      12512 ±  0%     -19.9%      10018 ±  1%  iostat.sdb.rkB/s
> >       1863 ±  3%     +24.7%       2325 ±  1%  iostat.sdb.rrqm/s
> >       1865 ±  3%     +24.3%       2319 ±  1%  iostat.sdc.rrqm/s
> >       1864 ±  3%     +24.6%       2322 ±  1%  iostat.sda.rrqm/s
> >        128 ±  0%     -16.4%        107 ±  0%  turbostat.PkgWatt
> >     150569 ±  0%      -8.7%     137525 ±  0%  iostat.md0.avgqu-sz
> >       4.29 ±  0%      -5.1%       4.07 ±  0%  turbostat.RAMWatt
> >
> >
> > more detailed changes about ivb44/fsmark/1x-64t-4BRD_12G-RAID5-btrfs-4M-30G-NoSync
> > ---------
> >
> > 26089f4902595a2f  4400755e356f9a2b0b7ceaa02f
> > ----------------  --------------------------
> >          %stddev     %change         %stddev
> >              \          |                \
> >        273 ±  4%     -18.1%        223 ±  6%  fsmark.files_per_sec
> >      29.24 ±  1%     +27.2%      37.20 ±  8%  fsmark.time.elapsed_time.max
> >      29.24 ±  1%     +27.2%      37.20 ±  8%  fsmark.time.elapsed_time
> >        399 ±  4%     -20.0%        319 ±  3%  fsmark.time.percent_of_cpu_this_job_got
> >     129891 ± 20%     -28.9%      92334 ± 15%  fsmark.time.voluntary_context_switches
> >        266 ±  0%    +413.4%       1365 ±  5%  slabinfo.raid5-md0.num_objs
> >        266 ±  0%    +413.4%       1365 ±  5%  slabinfo.raid5-md0.active_objs
> >       0.23 ± 27%     +98.6%       0.46 ± 35%  turbostat.CPU%c3
> >   56612063 ±  9%     +36.7%   77369763 ± 20%  cpuidle.C1-IVT.time
> >    5579498 ± 14%     -36.0%    3571516 ±  6%  cpuidle.C1E-IVT.time
> >       4668 ± 38%     +64.7%       7690 ± 19%  numa-vmstat.node0.nr_unevictable
> >      18674 ± 38%     +64.7%      30762 ± 19%  numa-meminfo.node0.Unevictable
> >       9298 ± 37%     +64.4%      15286 ± 19%  proc-vmstat.nr_unevictable
> >       4629 ± 37%     +64.1%       7596 ± 19%  numa-vmstat.node1.nr_unevictable
> >      18535 ± 37%     +63.9%      30385 ± 19%  numa-meminfo.node1.Unevictable
> >    4270894 ± 19%     +65.6%    7070923 ± 21%  cpuidle.C3-IVT.time
> >      38457 ± 37%     +59.0%      61148 ± 19%  meminfo.Unevictable
> >    3748226 ± 17%     +26.6%    4743674 ± 16%  numa-vmstat.node0.numa_local
> >    4495283 ± 13%     -24.8%    3382315 ± 17%  numa-vmstat.node0.nr_free_pages
> >    3818432 ± 16%     +26.5%    4830938 ± 16%  numa-vmstat.node0.numa_hit
> >   17966826 ± 13%     -24.7%   13537228 ± 17%  numa-meminfo.node0.MemFree
> >   14901309 ± 15%     +29.7%   19330906 ± 12%  numa-meminfo.node0.MemUsed
> >         26 ± 21%     -32.9%         17 ± 14%  cpuidle.POLL.usage
> >  1.183e+09 ±  1%     +29.6%  1.533e+09 ±  8%  cpuidle.C6-IVT.time
> >      29.24 ±  1%     +27.2%      37.20 ±  8%  time.elapsed_time
> >      29.24 ±  1%     +27.2%      37.20 ±  8%  time.elapsed_time.max
> >        399 ±  4%     -20.0%        319 ±  3%  time.percent_of_cpu_this_job_got
> >        850 ±  4%      -8.6%        777 ±  5%  slabinfo.blkdev_requests.num_objs
> >        850 ±  4%      -8.6%        777 ±  5%  slabinfo.blkdev_requests.active_objs
> >      14986 ±  9%     +17.1%      17548 ±  8%  numa-vmstat.node0.nr_slab_reclaimable
> >      11943 ±  5%     -12.6%      10441 ±  2%  slabinfo.kmalloc-192.num_objs
> >      59986 ±  9%     +17.0%      70186 ±  8%  numa-meminfo.node0.SReclaimable
> >       3703 ±  6%     +10.2%       4082 ±  7%  slabinfo.btrfs_delayed_data_ref.num_objs
> >     133551 ±  6%     +16.1%     154995 ±  1%  proc-vmstat.pgfault
> >     129891 ± 20%     -28.9%      92334 ± 15%  time.voluntary_context_switches
> >      11823 ±  4%     -12.0%      10409 ±  3%  slabinfo.kmalloc-192.active_objs
> >       3703 ±  6%      +9.7%       4061 ±  7%  slabinfo.btrfs_delayed_data_ref.active_objs
> >      19761 ±  2%     -11.2%      17542 ±  6%  slabinfo.anon_vma.active_objs
> >      19761 ±  2%     -11.2%      17544 ±  6%  slabinfo.anon_vma.num_objs
> >      13002 ±  3%     +14.9%      14944 ±  5%  slabinfo.kmalloc-256.num_objs
> >      12695 ±  3%     +13.8%      14446 ±  7%  slabinfo.kmalloc-256.active_objs
> >       1190 ±  1%     -11.8%       1050 ±  3%  slabinfo.mnt_cache.num_objs
> >       1190 ±  1%     -11.8%       1050 ±  3%  slabinfo.mnt_cache.active_objs
> >     136862 ±  1%     -13.8%     117938 ±  7%  cpuidle.C6-IVT.usage
> >    1692630 ±  3%     +12.3%    1900854 ±  0%  numa-vmstat.node0.nr_written
> >       1056 ±  2%      +8.8%       1149 ±  3%  slabinfo.mm_struct.active_objs
> >       1056 ±  2%      +8.8%       1149 ±  3%  slabinfo.mm_struct.num_objs
> >      24029 ± 11%     -30.6%      16673 ±  8%  vmstat.system.cs
> >       8859 ±  2%     -15.0%       7530 ±  8%  vmstat.system.in
> >     905630 ±  2%     -16.8%     753097 ±  4%  iostat.md0.wkB/s
> >     906433 ±  2%     -16.9%     753482 ±  4%  vmstat.io.bo
> >       3591 ±  2%     -16.9%       2982 ±  4%  iostat.md0.w/s
> >      13.22 ±  5%     -16.3%      11.07 ±  1%  turbostat.%Busy
> >        402 ±  4%     -15.9%        338 ±  1%  turbostat.Avg_MHz
> >      54236 ±  3%     +10.4%      59889 ±  4%  iostat.md0.avgqu-sz
> >       7.67 ±  1%      +4.5%       8.01 ±  1%  turbostat.RAMWatt
> >
> >
> >
> > --yliu
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
>