* [LKP] [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5% perf-stat.LLC-load-misses
@ 2015-04-23 6:55 Huang Ying
2015-04-24 2:15 ` NeilBrown
0 siblings, 1 reply; 3+ messages in thread
From: Huang Ying @ 2015-04-23 6:55 UTC (permalink / raw)
To: shli@kernel.org; +Cc: NeilBrown, LKML, LKP ML
[-- Attachment #1: Type: text/plain, Size: 14775 bytes --]
FYI, we noticed the below changes on
git://neil.brown.name/md for-next
commit 878ee6792799e2f88bdcac329845efadb205252f ("RAID5: batch adjacent full stripe write")
testbox/testcase/testparams: lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd
a87d7f782b47e030 878ee6792799e2f88bdcac3298
---------------- --------------------------
%stddev %change %stddev
\ | \
59035 ± 0% +18.4% 69913 ± 1% softirqs.SCHED
1330 ± 10% +17.4% 1561 ± 4% slabinfo.kmalloc-512.num_objs
1330 ± 10% +17.4% 1561 ± 4% slabinfo.kmalloc-512.active_objs
305908 ± 0% -1.8% 300427 ± 0% vmstat.io.bo
1 ± 0% +100.0% 2 ± 0% vmstat.procs.r
8266 ± 1% -15.7% 6968 ± 0% vmstat.system.cs
14819 ± 0% -2.1% 14503 ± 0% vmstat.system.in
18.20 ± 6% +10.2% 20.05 ± 4% perf-profile.cpu-cycles.raid_run_ops.handle_stripe.handle_active_stripes.raid5d.md_thread
1.94 ± 9% +90.6% 3.70 ± 9% perf-profile.cpu-cycles.async_xor.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
0.00 ± 0% +Inf% 25.18 ± 3% perf-profile.cpu-cycles.handle_active_stripes.isra.45.raid5d.md_thread.kthread.ret_from_fork
0.00 ± 0% +Inf% 14.14 ± 4% perf-profile.cpu-cycles.async_copy_data.isra.42.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
1.79 ± 7% +102.9% 3.64 ± 9% perf-profile.cpu-cycles.xor_blocks.async_xor.raid_run_ops.handle_stripe.handle_active_stripes
3.09 ± 4% -10.8% 2.76 ± 4% perf-profile.cpu-cycles.get_active_stripe.make_request.md_make_request.generic_make_request.submit_bio
0.80 ± 14% +28.1% 1.02 ± 10% perf-profile.cpu-cycles.mutex_lock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
14.78 ± 6% -100.0% 0.00 ± 0% perf-profile.cpu-cycles.async_copy_data.isra.38.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
25.68 ± 4% -100.0% 0.00 ± 0% perf-profile.cpu-cycles.handle_active_stripes.isra.41.raid5d.md_thread.kthread.ret_from_fork
1.23 ± 5% +140.0% 2.96 ± 7% perf-profile.cpu-cycles.xor_sse_5_pf64.xor_blocks.async_xor.raid_run_ops.handle_stripe
2.62 ± 6% -95.6% 0.12 ± 33% perf-profile.cpu-cycles.analyse_stripe.handle_stripe.handle_active_stripes.raid5d.md_thread
0.96 ± 9% +17.5% 1.12 ± 2% perf-profile.cpu-cycles.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
1.461e+10 ± 0% -5.3% 1.384e+10 ± 1% perf-stat.L1-dcache-load-misses
3.688e+11 ± 0% -2.7% 3.59e+11 ± 0% perf-stat.L1-dcache-loads
1.124e+09 ± 0% -27.7% 8.125e+08 ± 0% perf-stat.L1-dcache-prefetches
2.767e+10 ± 0% -1.8% 2.717e+10 ± 0% perf-stat.L1-dcache-store-misses
2.352e+11 ± 0% -2.8% 2.287e+11 ± 0% perf-stat.L1-dcache-stores
6.774e+09 ± 0% -2.3% 6.62e+09 ± 0% perf-stat.L1-icache-load-misses
5.571e+08 ± 0% +40.5% 7.826e+08 ± 1% perf-stat.LLC-load-misses
6.263e+09 ± 0% -13.7% 5.407e+09 ± 1% perf-stat.LLC-loads
1.914e+11 ± 0% -4.2% 1.833e+11 ± 0% perf-stat.branch-instructions
1.145e+09 ± 2% -5.6% 1.081e+09 ± 0% perf-stat.branch-load-misses
1.911e+11 ± 0% -4.3% 1.829e+11 ± 0% perf-stat.branch-loads
1.142e+09 ± 2% -5.1% 1.083e+09 ± 0% perf-stat.branch-misses
1.218e+09 ± 0% +19.8% 1.46e+09 ± 0% perf-stat.cache-misses
2.118e+10 ± 0% -5.2% 2.007e+10 ± 0% perf-stat.cache-references
2510308 ± 1% -15.7% 2115410 ± 0% perf-stat.context-switches
39623 ± 0% +22.1% 48370 ± 1% perf-stat.cpu-migrations
4.179e+08 ± 40% +165.7% 1.111e+09 ± 35% perf-stat.dTLB-load-misses
3.684e+11 ± 0% -2.5% 3.592e+11 ± 0% perf-stat.dTLB-loads
1.232e+08 ± 15% +62.5% 2.002e+08 ± 27% perf-stat.dTLB-store-misses
2.348e+11 ± 0% -2.5% 2.288e+11 ± 0% perf-stat.dTLB-stores
3577297 ± 2% +8.7% 3888986 ± 1% perf-stat.iTLB-load-misses
1.035e+12 ± 0% -3.5% 9.988e+11 ± 0% perf-stat.iTLB-loads
1.036e+12 ± 0% -3.7% 9.978e+11 ± 0% perf-stat.instructions
594 ± 30% +130.3% 1369 ± 13% sched_debug.cfs_rq[0]:/.blocked_load_avg
17 ± 10% -28.2% 12 ± 23% sched_debug.cfs_rq[0]:/.nr_spread_over
210 ± 21% +42.1% 298 ± 28% sched_debug.cfs_rq[0]:/.tg_runnable_contrib
9676 ± 21% +42.1% 13754 ± 28% sched_debug.cfs_rq[0]:/.avg->runnable_avg_sum
772 ± 25% +116.5% 1672 ± 9% sched_debug.cfs_rq[0]:/.tg_load_contrib
8402 ± 9% +83.3% 15405 ± 11% sched_debug.cfs_rq[0]:/.tg_load_avg
8356 ± 9% +82.8% 15272 ± 11% sched_debug.cfs_rq[1]:/.tg_load_avg
968 ± 25% +100.8% 1943 ± 14% sched_debug.cfs_rq[1]:/.blocked_load_avg
16242 ± 9% -22.2% 12643 ± 14% sched_debug.cfs_rq[1]:/.avg->runnable_avg_sum
353 ± 9% -22.1% 275 ± 14% sched_debug.cfs_rq[1]:/.tg_runnable_contrib
1183 ± 23% +77.7% 2102 ± 12% sched_debug.cfs_rq[1]:/.tg_load_contrib
181 ± 8% -31.4% 124 ± 26% sched_debug.cfs_rq[2]:/.tg_runnable_contrib
8364 ± 8% -31.3% 5745 ± 26% sched_debug.cfs_rq[2]:/.avg->runnable_avg_sum
8297 ± 9% +81.7% 15079 ± 12% sched_debug.cfs_rq[2]:/.tg_load_avg
30439 ± 13% -45.2% 16681 ± 26% sched_debug.cfs_rq[2]:/.exec_clock
39735 ± 14% -48.3% 20545 ± 29% sched_debug.cfs_rq[2]:/.min_vruntime
8231 ± 10% +82.2% 15000 ± 12% sched_debug.cfs_rq[3]:/.tg_load_avg
1210 ± 14% +110.3% 2546 ± 30% sched_debug.cfs_rq[4]:/.tg_load_contrib
8188 ± 10% +82.8% 14964 ± 12% sched_debug.cfs_rq[4]:/.tg_load_avg
8132 ± 10% +83.1% 14890 ± 12% sched_debug.cfs_rq[5]:/.tg_load_avg
749 ± 29% +205.9% 2292 ± 34% sched_debug.cfs_rq[5]:/.blocked_load_avg
963 ± 30% +169.9% 2599 ± 33% sched_debug.cfs_rq[5]:/.tg_load_contrib
37791 ± 32% -38.6% 23209 ± 13% sched_debug.cfs_rq[6]:/.min_vruntime
693 ± 25% +132.2% 1609 ± 29% sched_debug.cfs_rq[6]:/.blocked_load_avg
10838 ± 13% -39.2% 6587 ± 13% sched_debug.cfs_rq[6]:/.avg->runnable_avg_sum
29329 ± 27% -33.2% 19577 ± 10% sched_debug.cfs_rq[6]:/.exec_clock
235 ± 14% -39.7% 142 ± 14% sched_debug.cfs_rq[6]:/.tg_runnable_contrib
8085 ± 10% +83.6% 14848 ± 12% sched_debug.cfs_rq[6]:/.tg_load_avg
839 ± 25% +128.5% 1917 ± 18% sched_debug.cfs_rq[6]:/.tg_load_contrib
8051 ± 10% +83.6% 14779 ± 12% sched_debug.cfs_rq[7]:/.tg_load_avg
156 ± 34% +97.9% 309 ± 19% sched_debug.cpu#0.cpu_load[4]
160 ± 25% +64.0% 263 ± 16% sched_debug.cpu#0.cpu_load[2]
156 ± 32% +83.7% 286 ± 17% sched_debug.cpu#0.cpu_load[3]
164 ± 20% -35.1% 106 ± 31% sched_debug.cpu#2.cpu_load[0]
249 ± 15% +80.2% 449 ± 10% sched_debug.cpu#4.cpu_load[3]
231 ± 11% +101.2% 466 ± 13% sched_debug.cpu#4.cpu_load[2]
217 ± 14% +189.9% 630 ± 38% sched_debug.cpu#4.cpu_load[0]
71951 ± 5% +21.6% 87526 ± 7% sched_debug.cpu#4.nr_load_updates
214 ± 8% +146.1% 527 ± 27% sched_debug.cpu#4.cpu_load[1]
256 ± 17% +75.7% 449 ± 13% sched_debug.cpu#4.cpu_load[4]
209 ± 23% +98.3% 416 ± 48% sched_debug.cpu#5.cpu_load[2]
68024 ± 2% +18.8% 80825 ± 1% sched_debug.cpu#5.nr_load_updates
217 ± 26% +74.9% 380 ± 45% sched_debug.cpu#5.cpu_load[3]
852 ± 21% -38.3% 526 ± 22% sched_debug.cpu#6.curr->pid
lkp-st02: Core2
Memory: 8G
perf-stat.cache-misses
1.6e+09 O+-----O--O---O--O---O--------------------------------------------+
| O O O O O O O O O O |
1.4e+09 ++ |
1.2e+09 *+.*...* *..* * *...*..*...*..*...*..*...*..*...*..*
| : : : : : |
1e+09 ++ : : : : : : |
| : : : : : : |
8e+08 ++ : : : : : : |
| : : : : : : |
6e+08 ++ : : : : : : |
4e+08 ++ : : : : : : |
| : : : : : : |
2e+08 ++ : : : : : : |
| : : : |
0 ++-O------*----------*------*-------------------------------------+
perf-stat.L1-dcache-prefetches
1.2e+09 ++----------------------------------------------------------------+
*..*...* *..* * ..*.. ..*..*...*..*...*..*...*..*
1e+09 ++ : : : : *. *. |
| : : : :: : |
| : : : : : : O |
8e+08 O+ O: O :O O: O :O: O :O O O O O O O |
| : : : : : : |
6e+08 ++ : : : : : : |
| : : : : : : |
4e+08 ++ : : : : : : |
| : : : : : : |
| : : : : : : |
2e+08 ++ :: :: : : |
| : : : |
0 ++-O------*----------*------*-------------------------------------+
perf-stat.LLC-load-misses
1e+09 ++------------------------------------------------------------------+
9e+08 O+ O O O O O |
| O O O O |
8e+08 ++ O O O O O O |
7e+08 ++ |
| |
6e+08 *+..*..* *...* * *...*..*...*...*..*...*..*...*..*...*
5e+08 ++ : : : :: : |
4e+08 ++ : : : : : : |
| : : : : : : |
3e+08 ++ : : : : : : |
2e+08 ++ : : : : : : |
| : : : : : : |
1e+08 ++ : :: : |
0 ++--O------*---------*-------*--------------------------------------+
perf-stat.context-switches
3e+06 ++----------------------------------------------------------------+
| *...*..*... |
2.5e+06 *+.*...* *..* * : *..*... .*...*..*... .*
| : : : : : *. *. |
O O: O :O O: O :: : O O O O O O |
2e+06 ++ : : : :O: O :O O |
| : : : : : : |
1.5e+06 ++ : : : : : : |
| : : : : : : |
1e+06 ++ : : : : : : |
| : : : : : : |
| : : : : : : |
500000 ++ :: : : :: |
| : : : |
0 ++-O------*----------*------*-------------------------------------+
vmstat.system.cs
10000 ++------------------------------------------------------------------+
9000 ++ *...*.. |
*...*..* *...* * : *...*...*.. ..*..*...*.. ..*
8000 ++ : : : : : *. *. |
7000 O+ O: O O O: O : : : O O O O O O |
| : : : :O: O :O O |
6000 ++ : : : : : : |
5000 ++ : : : : : : |
4000 ++ : : : : : : |
| : : : : : : |
3000 ++ : : : : : : |
2000 ++ : : : : : : |
| : : :: :: |
1000 ++ : : : |
0 ++--O------*---------*-------*--------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
To reproduce:
apt-get install ruby
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/setup-local job.yaml # the job file attached in this email
bin/run-local job.yaml
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Ying Huang
[-- Attachment #2: job.yaml --]
[-- Type: text/plain, Size: 3296 bytes --]
---
testcase: dd-write
default-monitors:
wait: pre-test
uptime:
iostat:
vmstat:
numa-numastat:
numa-vmstat:
numa-meminfo:
proc-vmstat:
proc-stat:
meminfo:
slabinfo:
interrupts:
lock_stat:
latency_stats:
softirqs:
bdi_dev_mapping:
diskstats:
nfsstat:
cpuidle:
cpufreq-stats:
turbostat:
pmeter:
sched_debug:
interval: 10
default-watchdogs:
watch-oom:
watchdog:
cpufreq_governor:
commit: a1a71cc4c0a53e29fe27cede9392b0ad816ee956
model: Core2
memory: 8G
nr_hdd_partitions: 12
wait_disks_timeout: 300
hdd_partitions: "/dev/disk/by-id/scsi-35000c5000???????"
swap_partitions:
runtime: 5m
disk: 11HDD
md: RAID5
iosched: cfq
fs: xfs
fs2:
monitors:
perf-stat:
perf-profile:
ftrace:
events: balance_dirty_pages bdi_dirty_ratelimit global_dirty_state writeback_single_inode
nr_threads: 1dd
dd:
testbox: lkp-st02
tbox_group: lkp-st02
kconfig: x86_64-rhel
enqueue_time: 2015-04-19 11:59:58.120063120 +08:00
head_commit: a1a71cc4c0a53e29fe27cede9392b0ad816ee956
base_commit: 39a8804455fb23f09157341d3ba7db6d7ae6ee76
branch: linux-devel/devel-hourly-2015042014
kernel: "/kernel/x86_64-rhel/a1a71cc4c0a53e29fe27cede9392b0ad816ee956/vmlinuz-4.0.0-09109-ga1a71cc"
user: lkp
queue: cyclic
rootfs: debian-x86_64-2015-02-07.cgz
result_root: "/result/lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd/debian-x86_64-2015-02-07.cgz/x86_64-rhel/a1a71cc4c0a53e29fe27cede9392b0ad816ee956/0"
LKP_SERVER: inn
job_file: "/lkp/scheduled/lkp-st02/cyclic_dd-write-300-5m-11HDD-RAID5-cfq-xfs-1dd-x86_64-rhel-HEAD-a1a71cc4c0a53e29fe27cede9392b0ad816ee956-0-20150419-35022-17ddag2.yaml"
dequeue_time: 2015-04-20 16:17:46.635323077 +08:00
nr_cpu: "$(nproc)"
initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
bootloader_append:
- root=/dev/ram0
- user=lkp
- job=/lkp/scheduled/lkp-st02/cyclic_dd-write-300-5m-11HDD-RAID5-cfq-xfs-1dd-x86_64-rhel-HEAD-a1a71cc4c0a53e29fe27cede9392b0ad816ee956-0-20150419-35022-17ddag2.yaml
- ARCH=x86_64
- kconfig=x86_64-rhel
- branch=linux-devel/devel-hourly-2015042014
- commit=a1a71cc4c0a53e29fe27cede9392b0ad816ee956
- BOOT_IMAGE=/kernel/x86_64-rhel/a1a71cc4c0a53e29fe27cede9392b0ad816ee956/vmlinuz-4.0.0-09109-ga1a71cc
- RESULT_ROOT=/result/lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd/debian-x86_64-2015-02-07.cgz/x86_64-rhel/a1a71cc4c0a53e29fe27cede9392b0ad816ee956/0
- LKP_SERVER=inn
- |2-
earlyprintk=ttyS0,115200 rd.udev.log-priority=err systemd.log_target=journal systemd.log_level=warning
debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100
panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0
console=ttyS0,115200 console=tty0 vga=normal
rw
max_uptime: 1500
lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
modules_initrd: "/kernel/x86_64-rhel/a1a71cc4c0a53e29fe27cede9392b0ad816ee956/modules.cgz"
bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/fs.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/fs2.cgz"
job_state: finished
loadavg: 1.60 1.36 0.63 1/145 5859
start_time: '1429517927'
end_time: '1429518229'
version: "/lkp/lkp/.src-20150418-142223"
time_delta: '1429517881.362849165'
[-- Attachment #3: reproduce --]
[-- Type: text/plain, Size: 680 bytes --]
mdadm --stop /dev/md0
mdadm -q --create /dev/md0 --chunk=256 --level=raid5 --raid-devices=11 --force --assume-clean /dev/sdb /dev/sdg /dev/sdi /dev/sdh /dev/sdl /dev/sdf /dev/sdm /dev/sdk /dev/sdd /dev/sde /dev/sdc
mkfs -t xfs /dev/md0
mount -t xfs -o nobarrier,inode64 /dev/md0 /fs/md0
echo 1 > /sys/kernel/debug/tracing/events/writeback/balance_dirty_pages/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/bdi_dirty_ratelimit/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/global_dirty_state/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/writeback_single_inode/enable
dd if=/dev/zero of=/fs/md0/zero-1 status=noxfer &
sleep 300
killall -9 dd
[-- Attachment #4: Type: text/plain, Size: 89 bytes --]
_______________________________________________
LKP mailing list
LKP@linux.intel.com
^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [LKP] [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5% perf-stat.LLC-load-misses
2015-04-23 6:55 [LKP] [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5% perf-stat.LLC-load-misses Huang Ying
@ 2015-04-24 2:15 ` NeilBrown
2015-04-30 6:25 ` Yuanhan Liu
0 siblings, 1 reply; 3+ messages in thread
From: NeilBrown @ 2015-04-24 2:15 UTC (permalink / raw)
To: Huang Ying; +Cc: shli@kernel.org, LKML, LKP ML
[-- Attachment #1: Type: text/plain, Size: 15980 bytes --]
On Thu, 23 Apr 2015 14:55:59 +0800 Huang Ying <ying.huang@intel.com> wrote:
> FYI, we noticed the below changes on
>
> git://neil.brown.name/md for-next
> commit 878ee6792799e2f88bdcac329845efadb205252f ("RAID5: batch adjacent full stripe write")
Hi,
is there any chance that you could explain what some of this means?
There is lots of data and some very pretty graphs, but no explanation.
Which numbers are "good", which are "bad"? Which is "worst"?
What do the graphs really show? And what would we like to see in them?
I think it is really great that you are doing this testing and reporting the
results. It's just so sad that I completely fail to understand them.
Thanks,
NeilBrown
> [... full quoted report trimmed; see the original message above ...]
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]
^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [LKP] [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5% perf-stat.LLC-load-misses
2015-04-24 2:15 ` NeilBrown
@ 2015-04-30 6:25 ` Yuanhan Liu
0 siblings, 0 replies; 3+ messages in thread
From: Yuanhan Liu @ 2015-04-30 6:25 UTC (permalink / raw)
To: NeilBrown; +Cc: Huang Ying, shli@kernel.org, LKML, LKP ML, Fengguang Wu
On Fri, Apr 24, 2015 at 12:15:59PM +1000, NeilBrown wrote:
> On Thu, 23 Apr 2015 14:55:59 +0800 Huang Ying <ying.huang@intel.com> wrote:
>
> > FYI, we noticed the below changes on
> >
> > git://neil.brown.name/md for-next
> > commit 878ee6792799e2f88bdcac329845efadb205252f ("RAID5: batch adjacent full stripe write")
>
> Hi,
> is there any chance that you could explain what some of this means?
> There is lots of data and some very pretty graphs, but no explanation.
Hi Neil,
(Sorry for the late response: Ying is on vacation.)
I guess you can simply ignore this report, as I already reported to you a
month ago that this patch made fsmark perform better in most cases:
https://lists.01.org/pipermail/lkp/2015-March/002411.html
>
> Which numbers are "good", which are "bad"? Which is "worst".
> What do the graphs really show? and what would we like to see in them?
>
> I think it is really great that you are doing this testing and reporting the
> results. It's just so sad that I completely fail to understand them.
Sorry, it's our bad that they are hard to understand, and that we also
reported a duplicate one (well, the commit hash is different ;).
We might need to take some time to make those data easier to understand.
--yliu
>
> >
> >
> > testbox/testcase/testparams: lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd
> >
> > a87d7f782b47e030 878ee6792799e2f88bdcac3298
> > ---------------- --------------------------
> > %stddev %change %stddev
> > \ | \
> > 59035 ± 0% +18.4% 69913 ± 1% softirqs.SCHED
> > 1330 ± 10% +17.4% 1561 ± 4% slabinfo.kmalloc-512.num_objs
> > 1330 ± 10% +17.4% 1561 ± 4% slabinfo.kmalloc-512.active_objs
> > 305908 ± 0% -1.8% 300427 ± 0% vmstat.io.bo
> > 1 ± 0% +100.0% 2 ± 0% vmstat.procs.r
> > 8266 ± 1% -15.7% 6968 ± 0% vmstat.system.cs
> > 14819 ± 0% -2.1% 14503 ± 0% vmstat.system.in
> > 18.20 ± 6% +10.2% 20.05 ± 4% perf-profile.cpu-cycles.raid_run_ops.handle_stripe.handle_active_stripes.raid5d.md_thread
> > 1.94 ± 9% +90.6% 3.70 ± 9% perf-profile.cpu-cycles.async_xor.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
> > 0.00 ± 0% +Inf% 25.18 ± 3% perf-profile.cpu-cycles.handle_active_stripes.isra.45.raid5d.md_thread.kthread.ret_from_fork
> > 0.00 ± 0% +Inf% 14.14 ± 4% perf-profile.cpu-cycles.async_copy_data.isra.42.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
> > 1.79 ± 7% +102.9% 3.64 ± 9% perf-profile.cpu-cycles.xor_blocks.async_xor.raid_run_ops.handle_stripe.handle_active_stripes
> > 3.09 ± 4% -10.8% 2.76 ± 4% perf-profile.cpu-cycles.get_active_stripe.make_request.md_make_request.generic_make_request.submit_bio
> > 0.80 ± 14% +28.1% 1.02 ± 10% perf-profile.cpu-cycles.mutex_lock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
> > 14.78 ± 6% -100.0% 0.00 ± 0% perf-profile.cpu-cycles.async_copy_data.isra.38.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
> > 25.68 ± 4% -100.0% 0.00 ± 0% perf-profile.cpu-cycles.handle_active_stripes.isra.41.raid5d.md_thread.kthread.ret_from_fork
> > 1.23 ± 5% +140.0% 2.96 ± 7% perf-profile.cpu-cycles.xor_sse_5_pf64.xor_blocks.async_xor.raid_run_ops.handle_stripe
> > 2.62 ± 6% -95.6% 0.12 ± 33% perf-profile.cpu-cycles.analyse_stripe.handle_stripe.handle_active_stripes.raid5d.md_thread
> > 0.96 ± 9% +17.5% 1.12 ± 2% perf-profile.cpu-cycles.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
> > 1.461e+10 ± 0% -5.3% 1.384e+10 ± 1% perf-stat.L1-dcache-load-misses
> > 3.688e+11 ± 0% -2.7% 3.59e+11 ± 0% perf-stat.L1-dcache-loads
> > 1.124e+09 ± 0% -27.7% 8.125e+08 ± 0% perf-stat.L1-dcache-prefetches
> > 2.767e+10 ± 0% -1.8% 2.717e+10 ± 0% perf-stat.L1-dcache-store-misses
> > 2.352e+11 ± 0% -2.8% 2.287e+11 ± 0% perf-stat.L1-dcache-stores
> > 6.774e+09 ± 0% -2.3% 6.62e+09 ± 0% perf-stat.L1-icache-load-misses
> > 5.571e+08 ± 0% +40.5% 7.826e+08 ± 1% perf-stat.LLC-load-misses
> > 6.263e+09 ± 0% -13.7% 5.407e+09 ± 1% perf-stat.LLC-loads
> > 1.914e+11 ± 0% -4.2% 1.833e+11 ± 0% perf-stat.branch-instructions
> > 1.145e+09 ± 2% -5.6% 1.081e+09 ± 0% perf-stat.branch-load-misses
> > 1.911e+11 ± 0% -4.3% 1.829e+11 ± 0% perf-stat.branch-loads
> > 1.142e+09 ± 2% -5.1% 1.083e+09 ± 0% perf-stat.branch-misses
> > 1.218e+09 ± 0% +19.8% 1.46e+09 ± 0% perf-stat.cache-misses
> > 2.118e+10 ± 0% -5.2% 2.007e+10 ± 0% perf-stat.cache-references
> > 2510308 ± 1% -15.7% 2115410 ± 0% perf-stat.context-switches
> > 39623 ± 0% +22.1% 48370 ± 1% perf-stat.cpu-migrations
> > 4.179e+08 ± 40% +165.7% 1.111e+09 ± 35% perf-stat.dTLB-load-misses
> > 3.684e+11 ± 0% -2.5% 3.592e+11 ± 0% perf-stat.dTLB-loads
> > 1.232e+08 ± 15% +62.5% 2.002e+08 ± 27% perf-stat.dTLB-store-misses
> > 2.348e+11 ± 0% -2.5% 2.288e+11 ± 0% perf-stat.dTLB-stores
> > 3577297 ± 2% +8.7% 3888986 ± 1% perf-stat.iTLB-load-misses
> > 1.035e+12 ± 0% -3.5% 9.988e+11 ± 0% perf-stat.iTLB-loads
> > 1.036e+12 ± 0% -3.7% 9.978e+11 ± 0% perf-stat.instructions
> > 594 ± 30% +130.3% 1369 ± 13% sched_debug.cfs_rq[0]:/.blocked_load_avg
> > 17 ± 10% -28.2% 12 ± 23% sched_debug.cfs_rq[0]:/.nr_spread_over
> > 210 ± 21% +42.1% 298 ± 28% sched_debug.cfs_rq[0]:/.tg_runnable_contrib
> > 9676 ± 21% +42.1% 13754 ± 28% sched_debug.cfs_rq[0]:/.avg->runnable_avg_sum
> > 772 ± 25% +116.5% 1672 ± 9% sched_debug.cfs_rq[0]:/.tg_load_contrib
> > 8402 ± 9% +83.3% 15405 ± 11% sched_debug.cfs_rq[0]:/.tg_load_avg
> > 8356 ± 9% +82.8% 15272 ± 11% sched_debug.cfs_rq[1]:/.tg_load_avg
> > 968 ± 25% +100.8% 1943 ± 14% sched_debug.cfs_rq[1]:/.blocked_load_avg
> > 16242 ± 9% -22.2% 12643 ± 14% sched_debug.cfs_rq[1]:/.avg->runnable_avg_sum
> > 353 ± 9% -22.1% 275 ± 14% sched_debug.cfs_rq[1]:/.tg_runnable_contrib
> > 1183 ± 23% +77.7% 2102 ± 12% sched_debug.cfs_rq[1]:/.tg_load_contrib
> > 181 ± 8% -31.4% 124 ± 26% sched_debug.cfs_rq[2]:/.tg_runnable_contrib
> > 8364 ± 8% -31.3% 5745 ± 26% sched_debug.cfs_rq[2]:/.avg->runnable_avg_sum
> > 8297 ± 9% +81.7% 15079 ± 12% sched_debug.cfs_rq[2]:/.tg_load_avg
> > 30439 ± 13% -45.2% 16681 ± 26% sched_debug.cfs_rq[2]:/.exec_clock
> > 39735 ± 14% -48.3% 20545 ± 29% sched_debug.cfs_rq[2]:/.min_vruntime
> > 8231 ± 10% +82.2% 15000 ± 12% sched_debug.cfs_rq[3]:/.tg_load_avg
> > 1210 ± 14% +110.3% 2546 ± 30% sched_debug.cfs_rq[4]:/.tg_load_contrib
> > 8188 ± 10% +82.8% 14964 ± 12% sched_debug.cfs_rq[4]:/.tg_load_avg
> > 8132 ± 10% +83.1% 14890 ± 12% sched_debug.cfs_rq[5]:/.tg_load_avg
> > 749 ± 29% +205.9% 2292 ± 34% sched_debug.cfs_rq[5]:/.blocked_load_avg
> > 963 ± 30% +169.9% 2599 ± 33% sched_debug.cfs_rq[5]:/.tg_load_contrib
> > 37791 ± 32% -38.6% 23209 ± 13% sched_debug.cfs_rq[6]:/.min_vruntime
> > 693 ± 25% +132.2% 1609 ± 29% sched_debug.cfs_rq[6]:/.blocked_load_avg
> > 10838 ± 13% -39.2% 6587 ± 13% sched_debug.cfs_rq[6]:/.avg->runnable_avg_sum
> > 29329 ± 27% -33.2% 19577 ± 10% sched_debug.cfs_rq[6]:/.exec_clock
> > 235 ± 14% -39.7% 142 ± 14% sched_debug.cfs_rq[6]:/.tg_runnable_contrib
> > 8085 ± 10% +83.6% 14848 ± 12% sched_debug.cfs_rq[6]:/.tg_load_avg
> > 839 ± 25% +128.5% 1917 ± 18% sched_debug.cfs_rq[6]:/.tg_load_contrib
> > 8051 ± 10% +83.6% 14779 ± 12% sched_debug.cfs_rq[7]:/.tg_load_avg
> > 156 ± 34% +97.9% 309 ± 19% sched_debug.cpu#0.cpu_load[4]
> > 160 ± 25% +64.0% 263 ± 16% sched_debug.cpu#0.cpu_load[2]
> > 156 ± 32% +83.7% 286 ± 17% sched_debug.cpu#0.cpu_load[3]
> > 164 ± 20% -35.1% 106 ± 31% sched_debug.cpu#2.cpu_load[0]
> > 249 ± 15% +80.2% 449 ± 10% sched_debug.cpu#4.cpu_load[3]
> > 231 ± 11% +101.2% 466 ± 13% sched_debug.cpu#4.cpu_load[2]
> > 217 ± 14% +189.9% 630 ± 38% sched_debug.cpu#4.cpu_load[0]
> > 71951 ± 5% +21.6% 87526 ± 7% sched_debug.cpu#4.nr_load_updates
> > 214 ± 8% +146.1% 527 ± 27% sched_debug.cpu#4.cpu_load[1]
> > 256 ± 17% +75.7% 449 ± 13% sched_debug.cpu#4.cpu_load[4]
> > 209 ± 23% +98.3% 416 ± 48% sched_debug.cpu#5.cpu_load[2]
> > 68024 ± 2% +18.8% 80825 ± 1% sched_debug.cpu#5.nr_load_updates
> > 217 ± 26% +74.9% 380 ± 45% sched_debug.cpu#5.cpu_load[3]
> > 852 ± 21% -38.3% 526 ± 22% sched_debug.cpu#6.curr->pid
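
The %change column in the comparison tables above is the relative difference between the two commits' sample means (with %stddev giving the run-to-run noise). A minimal sketch of that arithmetic, checked against two of the reported vmstat metrics (the function name is illustrative, not part of the LKP tooling):

```python
def pct_change(parent_mean: float, patched_mean: float) -> float:
    """Relative delta of the patched commit's mean vs. the parent commit's mean."""
    return (patched_mean - parent_mean) / parent_mean * 100.0

# vmstat.io.bo: 305908 (a87d7f7) -> 300427 (878ee67)
print(f"{pct_change(305908, 300427):+.1f}%")  # -1.8%
# vmstat.system.cs: 8266 -> 6968
print(f"{pct_change(8266, 6968):+.1f}%")      # -15.7%
```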
> >
> > lkp-st02: Core2
> > Memory: 8G
> >
> >
> >
> >
> > perf-stat.cache-misses
> >
> > 1.6e+09 O+-----O--O---O--O---O--------------------------------------------+
> > | O O O O O O O O O O |
> > 1.4e+09 ++ |
> > 1.2e+09 *+.*...* *..* * *...*..*...*..*...*..*...*..*...*..*
> > | : : : : : |
> > 1e+09 ++ : : : : : : |
> > | : : : : : : |
> > 8e+08 ++ : : : : : : |
> > | : : : : : : |
> > 6e+08 ++ : : : : : : |
> > 4e+08 ++ : : : : : : |
> > | : : : : : : |
> > 2e+08 ++ : : : : : : |
> > | : : : |
> > 0 ++-O------*----------*------*-------------------------------------+
> >
> >
> > perf-stat.L1-dcache-prefetches
> >
> > 1.2e+09 ++----------------------------------------------------------------+
> > *..*...* *..* * ..*.. ..*..*...*..*...*..*...*..*
> > 1e+09 ++ : : : : *. *. |
> > | : : : :: : |
> > | : : : : : : O |
> > 8e+08 O+ O: O :O O: O :O: O :O O O O O O O |
> > | : : : : : : |
> > 6e+08 ++ : : : : : : |
> > | : : : : : : |
> > 4e+08 ++ : : : : : : |
> > | : : : : : : |
> > | : : : : : : |
> > 2e+08 ++ :: :: : : |
> > | : : : |
> > 0 ++-O------*----------*------*-------------------------------------+
> >
> >
> > perf-stat.LLC-load-misses
> >
> > 1e+09 ++------------------------------------------------------------------+
> > 9e+08 O+ O O O O O |
> > | O O O O |
> > 8e+08 ++ O O O O O O |
> > 7e+08 ++ |
> > | |
> > 6e+08 *+..*..* *...* * *...*..*...*...*..*...*..*...*..*...*
> > 5e+08 ++ : : : :: : |
> > 4e+08 ++ : : : : : : |
> > | : : : : : : |
> > 3e+08 ++ : : : : : : |
> > 2e+08 ++ : : : : : : |
> > | : : : : : : |
> > 1e+08 ++ : :: : |
> > 0 ++--O------*---------*-------*--------------------------------------+
> >
> >
> > perf-stat.context-switches
> >
> > 3e+06 ++----------------------------------------------------------------+
> > | *...*..*... |
> > 2.5e+06 *+.*...* *..* * : *..*... .*...*..*... .*
> > | : : : : : *. *. |
> > O O: O :O O: O :: : O O O O O O |
> > 2e+06 ++ : : : :O: O :O O |
> > | : : : : : : |
> > 1.5e+06 ++ : : : : : : |
> > | : : : : : : |
> > 1e+06 ++ : : : : : : |
> > | : : : : : : |
> > | : : : : : : |
> > 500000 ++ :: : : :: |
> > | : : : |
> > 0 ++-O------*----------*------*-------------------------------------+
> >
> >
> > vmstat.system.cs
> >
> > 10000 ++------------------------------------------------------------------+
> > 9000 ++ *...*.. |
> > *...*..* *...* * : *...*...*.. ..*..*...*.. ..*
> > 8000 ++ : : : : : *. *. |
> > 7000 O+ O: O O O: O : : : O O O O O O |
> > | : : : :O: O :O O |
> > 6000 ++ : : : : : : |
> > 5000 ++ : : : : : : |
> > 4000 ++ : : : : : : |
> > | : : : : : : |
> > 3000 ++ : : : : : : |
> > 2000 ++ : : : : : : |
> > | : : :: :: |
> > 1000 ++ : : : |
> > 0 ++--O------*---------*-------*--------------------------------------+
> >
> >
> > [*] bisect-good sample
> > [O] bisect-bad sample
> >
> > To reproduce:
> >
> > apt-get install ruby
> > git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
> > cd lkp-tests
> > bin/setup-local job.yaml # the job file attached in this email
> > bin/run-local job.yaml
> >
> >
> > Disclaimer:
> > Results have been estimated based on internal Intel analysis and are provided
> > for informational purposes only. Any difference in system hardware or software
> > design or configuration may affect actual performance.
> >
> >
> > Thanks,
> > Ying Huang
> >
>
Thread overview: 3 messages
2015-04-23 6:55 [LKP] [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5% perf-stat.LLC-load-misses Huang Ying
2015-04-24 2:15 ` NeilBrown
2015-04-30 6:25 ` Yuanhan Liu