From: kernel test robot <oliver.sang@intel.com>
To: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: feng.tang@intel.com, intel-gfx@lists.freedesktop.org,
Rodrigo Vivi <rodrigo.vivi@intel.com>,
fengwei.yin@intel.com, dri-devel@lists.freedesktop.org,
oliver.sang@intel.com, ying.huang@intel.com,
oe-lkp@lists.linux.dev
Subject: Re: [Intel-gfx] [PATCH] drm/i915/gem: Allow users to disable waitboost
Date: Tue, 26 Sep 2023 10:58:47 +0800
Message-ID: <202309261055.b74df987-oliver.sang@intel.com>
In-Reply-To: <20230920215624.3482244-1-vinay.belgaumkar@intel.com>
Hello,
kernel test robot noticed a -3.2% regression of phoronix-test-suite.paraview.WaveletContour.1024x768.mipolys___sec on:
commit: 54fef7ea35dadd66193b98805b0bc42ef2b279db ("[PATCH] drm/i915/gem: Allow users to disable waitboost")
url: https://github.com/intel-lab-lkp/linux/commits/Vinay-Belgaumkar/drm-i915-gem-Allow-users-to-disable-waitboost/20230921-060357
base: git://anongit.freedesktop.org/drm-intel for-linux-next
patch link: https://lore.kernel.org/all/20230920215624.3482244-1-vinay.belgaumkar@intel.com/
patch subject: [PATCH] drm/i915/gem: Allow users to disable waitboost
testcase: phoronix-test-suite
test machine: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
parameters:
need_x: true
test: paraview-1.0.2
option_a: Wavelet Contour
option_b: 1024 x 768
cpufreq_governor: performance
In addition to that, the commit also has a significant impact on the following tests:
+------------------+----------------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.x11perf.PutImageXY500x500Square.operations___second 12.8% improvement |
| test machine | 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory |
| test parameters | cpufreq_governor=performance |
| | need_x=true |
| | option_a=PutImage XY 500x500 Square |
| | test=x11perf-1.1.1 |
+------------------+----------------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.openarena.2560x1440.milliseconds -12.2% regression |
| test machine | 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory |
| test parameters | cpufreq_governor=performance |
| | need_x=true |
| | option_a=2560 x 1440 |
| | test=openarena-1.5.5 |
+------------------+----------------------------------------------------------------------------------------------------------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202309261055.b74df987-oliver.sang@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230926/202309261055.b74df987-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/option_b/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/true/Wavelet Contour/1024 x 768/debian-x86_64-phoronix/lkp-cfl-d2/paraview-1.0.2/phoronix-test-suite
commit:
16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")
16a9359401edcbc0 54fef7ea35dadd66193b98805b0
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.05 ± 4% +0.0 0.06 ± 2% mpstat.cpu.all.soft%
66.17 ± 60% +145.6% 162.50 ± 33% turbostat.C10
28.61 -3.3% 27.68 phoronix-test-suite.paraview.WaveletContour.1024x768.frames___sec
298.15 -3.2% 288.49 phoronix-test-suite.paraview.WaveletContour.1024x768.mipolys___sec
535005 +8.6% 580810 phoronix-test-suite.time.minor_page_faults
6278 +8.3% 6797 phoronix-test-suite.time.voluntary_context_switches
801166 +5.6% 845675 proc-vmstat.numa_hit
799382 +5.1% 840353 proc-vmstat.numa_local
59648 +2.6% 61211 proc-vmstat.pgactivate
1539307 +2.7% 1580759 proc-vmstat.pgalloc_normal
734297 +6.8% 783862 proc-vmstat.pgfault
1343231 +3.1% 1385353 proc-vmstat.pgfree
39042 +3.7% 40480 ± 2% proc-vmstat.pgreuse
1.106e+08 +2.1% 1.129e+08 perf-stat.i.cache-references
4.255e+09 +1.9% 4.336e+09 perf-stat.i.cpu-cycles
147872 +2.4% 151392 perf-stat.i.dTLB-store-misses
230242 +2.2% 235419 perf-stat.i.iTLB-load-misses
569455 +2.4% 583234 perf-stat.i.iTLB-loads
0.35 +1.9% 0.36 perf-stat.i.metric.GHz
12547 +8.6% 13625 perf-stat.i.minor-faults
609443 ± 2% +4.0% 633739 ± 2% perf-stat.i.node-loads
1701083 ± 2% +5.0% 1786794 ± 2% perf-stat.i.node-stores
12566 +8.6% 13644 perf-stat.i.page-faults
1.085e+08 +2.1% 1.108e+08 perf-stat.ps.cache-references
4.179e+09 +1.9% 4.259e+09 perf-stat.ps.cpu-cycles
225995 +2.3% 231096 perf-stat.ps.iTLB-load-misses
558694 +2.4% 572272 perf-stat.ps.iTLB-loads
12320 +8.6% 13380 perf-stat.ps.minor-faults
598059 ± 2% +4.0% 621901 ± 2% perf-stat.ps.node-loads
1670891 ± 2% +5.0% 1754927 ± 2% perf-stat.ps.node-stores
12339 +8.6% 13399 perf-stat.ps.page-faults
14.86 ± 42% -11.7 3.12 ±152% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
14.86 ± 42% -11.7 3.12 ±152% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
14.86 ± 42% -11.7 3.12 ±152% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
14.86 ± 42% -11.0 3.82 ±161% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
11.53 ± 67% -10.1 1.39 ±223% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
11.53 ± 67% -9.4 2.08 ±223% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
11.53 ± 67% -9.1 2.43 ±143% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
6.02 ± 95% -5.3 0.70 ±223% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
5.51 ± 56% -4.8 0.70 ±223% perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
7.02 ±128% -2.6 4.44 ±147% perf-profile.calltrace.cycles-pp.do_group_exit.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare
7.02 ±128% -2.6 4.44 ±147% perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop
7.02 ±128% -2.6 4.44 ±147% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.02 ±128% -2.6 4.44 ±147% perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.02 ±128% -2.6 4.44 ±147% perf-profile.calltrace.cycles-pp.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
7.02 ±128% -2.6 4.44 ±147% perf-profile.calltrace.cycles-pp.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
14.86 ± 42% -11.7 3.12 ±152% perf-profile.children.cycles-pp.start_secondary
14.86 ± 42% -11.0 3.82 ±161% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
14.86 ± 42% -11.0 3.82 ±161% perf-profile.children.cycles-pp.cpu_startup_entry
14.86 ± 42% -11.0 3.82 ±161% perf-profile.children.cycles-pp.do_idle
11.53 ± 67% -9.4 2.08 ±223% perf-profile.children.cycles-pp.cpuidle_enter
11.53 ± 67% -9.4 2.08 ±223% perf-profile.children.cycles-pp.cpuidle_enter_state
11.53 ± 67% -8.4 3.12 ±152% perf-profile.children.cycles-pp.cpuidle_idle_call
6.02 ± 95% -5.3 0.70 ±223% perf-profile.children.cycles-pp.intel_idle
5.51 ± 56% -4.8 0.70 ±223% perf-profile.children.cycles-pp.intel_idle_ibrs
7.02 ±128% -2.6 4.44 ±147% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
7.02 ±128% -2.6 4.44 ±147% perf-profile.children.cycles-pp.exit_to_user_mode_loop
7.02 ±128% -2.6 4.44 ±147% perf-profile.children.cycles-pp.arch_do_signal_or_restart
7.02 ±128% -2.6 4.44 ±147% perf-profile.children.cycles-pp.get_signal
6.02 ± 95% -5.3 0.70 ±223% perf-profile.self.cycles-pp.intel_idle
5.51 ± 56% -4.8 0.70 ±223% perf-profile.self.cycles-pp.intel_idle_ibrs
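The %change column above can be recomputed from the two value columns. The sketch below is only an illustration, not part of the lkp tooling: the function names are hypothetical, and it assumes (from the numbers, not from any stated rule in this report) that a change printed with a '%' sign is relative to the base commit, while a change printed without one, as on the perf-profile rows, is a plain difference between two percentages.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Plain difference, assumed here for metrics that are already percentages."""
    return new - base


if __name__ == "__main__":
    # Values copied from the paraview rows above.
    print(f"mipolys___sec:   {pct_change(298.15, 288.49):+.1f}%")
    print(f"frames___sec:    {pct_change(28.61, 27.68):+.1f}%")
    # perf-profile rows report a share of cycles, so the change is a difference.
    print(f"start_secondary: {point_change(14.86, 3.12):+.1f}")
```

Note that many of the perf-profile rows carry a %stddev well above 100%, so the recomputed differences for those rows say little on their own.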
***************************************************************************************************
lkp-cfl-d2: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/true/PutImage XY 500x500 Square/debian-x86_64-phoronix/lkp-cfl-d2/x11perf-1.1.1/phoronix-test-suite
commit:
16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")
16a9359401edcbc0 54fef7ea35dadd66193b98805b0
---------------- ---------------------------
%stddev %change %stddev
\ | \
11950 ± 3% +93.9% 23173 ± 3% meminfo.Unevictable
1.02 ± 3% -0.6 0.43 ± 3% mpstat.cpu.all.iowait%
21273 -5.3% 20141 vmstat.system.in
4887 ± 32% -67.7% 1579 ± 32% phoronix-test-suite.time.involuntary_context_switches
147212 +26.1% 185677 phoronix-test-suite.time.minor_page_faults
76.83 +10.6% 85.00 phoronix-test-suite.time.percent_of_cpu_this_job_got
96.48 ± 4% +15.0% 110.93 ± 2% phoronix-test-suite.time.user_time
106.83 +12.8% 120.50 phoronix-test-suite.x11perf.PutImageXY500x500Square.operations___second
3.73 ± 3% -0.6 3.12 ± 2% turbostat.C1E%
58316 ± 5% -20.6% 46326 ± 3% turbostat.C3
0.86 ± 2% -0.1 0.74 ± 3% turbostat.C3%
1.24 ± 8% -18.9% 1.01 ± 10% turbostat.CPU%c3
38.67 ± 2% +5.0 43.70 ± 2% turbostat.CPUGFX%
23.44 +4.0% 24.38 turbostat.CorWatt
2.30 ± 4% -71.6% 0.65 ± 2% turbostat.GFXWatt
26.19 -2.7% 25.49 turbostat.PkgWatt
1.41 +2.4% 1.44 turbostat.RAMWatt
61.83 ± 23% -20.2 41.62 ± 31% perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
61.85 ± 23% -20.2 41.64 ± 31% perf-profile.children.cycles-pp.intel_idle_ibrs
1.24 ± 59% -1.0 0.23 ± 27% perf-profile.children.cycles-pp.worker_thread
0.84 ± 60% -0.7 0.18 ± 48% perf-profile.children.cycles-pp.asm_common_interrupt
0.83 ± 61% -0.7 0.18 ± 48% perf-profile.children.cycles-pp.common_interrupt
0.35 ± 74% -0.3 0.06 ± 85% perf-profile.children.cycles-pp.__common_interrupt
0.35 ± 75% -0.3 0.06 ± 85% perf-profile.children.cycles-pp.handle_edge_irq
0.30 ± 78% -0.3 0.05 ± 81% perf-profile.children.cycles-pp.handle_irq_event
0.37 ± 18% -0.2 0.18 ± 46% perf-profile.children.cycles-pp.newidle_balance
0.12 ± 32% -0.1 0.04 ± 79% perf-profile.children.cycles-pp.exit_mm
0.12 ± 30% -0.1 0.05 ± 73% perf-profile.children.cycles-pp.native_apic_msr_eoi
0.03 ±144% +0.1 0.10 ± 38% perf-profile.children.cycles-pp.dma_resv_iter_first_unlocked
0.04 ±154% +0.1 0.19 ± 40% perf-profile.children.cycles-pp.i915_gem_busy_ioctl
61.82 ± 23% -20.2 41.63 ± 31% perf-profile.self.cycles-pp.intel_idle_ibrs
0.12 ± 26% -0.1 0.05 ± 74% perf-profile.self.cycles-pp.native_apic_msr_eoi
31861 +5.9% 33748 proc-vmstat.nr_active_file
29007 +7.6% 31200 proc-vmstat.nr_mapped
183598 +1.6% 186520 proc-vmstat.nr_shmem
2987 ± 3% +93.9% 5793 ± 3% proc-vmstat.nr_unevictable
31861 +5.9% 33748 proc-vmstat.nr_zone_active_file
2987 ± 3% +93.9% 5793 ± 3% proc-vmstat.nr_zone_unevictable
721353 ± 2% +13.3% 816958 proc-vmstat.numa_hit
721406 ± 2% +13.2% 816970 proc-vmstat.numa_local
21301 ± 4% +13.1% 24099 proc-vmstat.pgactivate
2459642 ± 3% +15.5% 2840123 ± 2% proc-vmstat.pgalloc_normal
530507 ± 3% +8.0% 573060 proc-vmstat.pgfault
2336660 ± 3% +16.3% 2717873 ± 2% proc-vmstat.pgfree
9973 ± 2% +143.6% 24292 ± 3% proc-vmstat.unevictable_pgs_culled
9875 +146.0% 24292 ± 3% proc-vmstat.unevictable_pgs_rescued
9876 +146.1% 24304 ± 3% proc-vmstat.unevictable_pgs_scanned
1.413e+09 +12.4% 1.588e+09 perf-stat.i.branch-instructions
2.09 -0.1 1.99 ± 2% perf-stat.i.branch-miss-rate%
23305261 +1.8% 23715959 perf-stat.i.branch-misses
4.57 ± 4% +1.5 6.06 ± 2% perf-stat.i.cache-miss-rate%
4971151 ± 5% +71.3% 8515308 ± 2% perf-stat.i.cache-misses
1.771e+08 +6.4% 1.884e+08 perf-stat.i.cache-references
0.91 -3.5% 0.88 ± 2% perf-stat.i.cpi
4.936e+09 +7.8% 5.319e+09 perf-stat.i.cpu-cycles
39.95 ± 3% -15.1% 33.90 ± 2% perf-stat.i.cpu-migrations
1671 ± 4% -34.5% 1095 ± 6% perf-stat.i.cycles-between-cache-misses
2.815e+09 +11.8% 3.147e+09 perf-stat.i.dTLB-loads
0.02 ± 2% -0.0 0.02 ± 4% perf-stat.i.dTLB-store-miss-rate%
6.478e+08 +10.4% 7.153e+08 perf-stat.i.dTLB-stores
40.15 ± 2% -2.9 37.29 perf-stat.i.iTLB-load-miss-rate%
1.063e+10 +12.2% 1.192e+10 perf-stat.i.instructions
40912 ± 4% +25.5% 51361 ± 2% perf-stat.i.instructions-per-iTLB-miss
2.18 +2.7% 2.23 perf-stat.i.ipc
0.41 +7.8% 0.44 perf-stat.i.metric.GHz
421.03 +11.6% 469.88 perf-stat.i.metric.M/sec
3358 +5.5% 3542 perf-stat.i.minor-faults
0.00 ± 11% -0.0 0.00 ± 6% perf-stat.i.node-load-miss-rate%
6.40 ± 8% +17.6% 7.52 ± 5% perf-stat.i.node-load-misses
216221 ± 6% +61.9% 350134 perf-stat.i.node-loads
0.00 ± 16% +0.1 0.12 ±220% perf-stat.i.node-store-miss-rate%
6.72 ± 6% +6.9e+12% 4.615e+11 ±223% perf-stat.i.node-store-misses
694301 ± 9% -14.3% 594881 ± 3% perf-stat.i.node-stores
3361 +5.5% 3545 perf-stat.i.page-faults
0.47 ± 5% +52.6% 0.71 ± 2% perf-stat.overall.MPKI
1.65 -0.2 1.49 perf-stat.overall.branch-miss-rate%
2.81 ± 6% +1.7 4.52 ± 2% perf-stat.overall.cache-miss-rate%
0.46 -4.0% 0.45 perf-stat.overall.cpi
994.89 ± 5% -37.2% 624.75 ± 2% perf-stat.overall.cycles-between-cache-misses
0.03 -0.0 0.02 perf-stat.overall.dTLB-load-miss-rate%
0.01 -0.0 0.01 perf-stat.overall.dTLB-store-miss-rate%
30.99 ± 3% -2.3 28.68 ± 4% perf-stat.overall.iTLB-load-miss-rate%
25510 ± 11% +20.2% 30672 ± 5% perf-stat.overall.instructions-per-iTLB-miss
2.15 +4.1% 2.24 perf-stat.overall.ipc
0.00 ± 8% -0.0 0.00 ± 5% perf-stat.overall.node-load-miss-rate%
0.00 ± 8% +16.7 16.67 ±223% perf-stat.overall.node-store-miss-rate%
1.403e+09 +12.5% 1.578e+09 perf-stat.ps.branch-instructions
23153579 +1.7% 23558574 perf-stat.ps.branch-misses
4938806 ± 5% +71.3% 8457880 ± 2% perf-stat.ps.cache-misses
1.758e+08 +6.4% 1.871e+08 perf-stat.ps.cache-references
4.901e+09 +7.8% 5.282e+09 perf-stat.ps.cpu-cycles
39.66 ± 3% -15.1% 33.67 ± 2% perf-stat.ps.cpu-migrations
2.795e+09 +11.8% 3.125e+09 perf-stat.ps.dTLB-loads
6.432e+08 +10.4% 7.103e+08 perf-stat.ps.dTLB-stores
1.055e+10 +12.2% 1.184e+10 perf-stat.ps.instructions
3336 +5.5% 3518 perf-stat.ps.minor-faults
6.35 ± 8% +17.6% 7.47 ± 5% perf-stat.ps.node-load-misses
214756 ± 6% +61.9% 347761 perf-stat.ps.node-loads
6.67 ± 6% +6.9e+12% 4.577e+11 ±223% perf-stat.ps.node-store-misses
689455 ± 9% -14.3% 590802 ± 3% perf-stat.ps.node-stores
3338 +5.5% 3521 perf-stat.ps.page-faults
1.467e+12 ± 4% +15.7% 1.698e+12 ± 2% perf-stat.total.instructions
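For filtering a comparison table like the one above (for example, to drop rows whose %stddev is large), a row can be split into its fields with a regular expression. This is a minimal sketch that assumes the row layout seen in this report (base value, optional "± N%", signed change with optional '%', new value, optional "± N%", metric name); the names `Row` and `parse_row` are hypothetical and are not part of lkp-tests.

```python
import re
from typing import NamedTuple, Optional

# One number as printed in the report: plain decimal or scientific notation.
_NUM = r"\d+(?:\.\d+)?(?:e[+-]?\d+)?"

_ROW = re.compile(
    rf"^\s*(?P<base>{_NUM})(?:\s*±\s*(?P<base_sd>\d+)%)?"
    rf"\s+(?P<change>[+-]{_NUM})(?P<rel>%?)"
    rf"\s+(?P<new>{_NUM})(?:\s*±\s*(?P<new_sd>\d+)%)?"
    rf"\s+(?P<metric>\S.*)$"
)


class Row(NamedTuple):
    metric: str
    base: float
    new: float
    change: float
    relative: bool                  # True when the change column carried a '%'
    base_stddev_pct: Optional[int]  # None when no "± N%" was printed
    new_stddev_pct: Optional[int]


def _opt_int(text: Optional[str]) -> Optional[int]:
    return int(text) if text is not None else None


def parse_row(line: str) -> Optional[Row]:
    """Return the parsed row, or None for headers and separator lines."""
    m = _ROW.match(line)
    if m is None:
        return None
    return Row(
        metric=m["metric"].strip(),
        base=float(m["base"]),
        new=float(m["new"]),
        change=float(m["change"]),
        relative=m["rel"] == "%",
        base_stddev_pct=_opt_int(m["base_sd"]),
        new_stddev_pct=_opt_int(m["new_sd"]),
    )


if __name__ == "__main__":
    print(parse_row("2.30 ± 4% -71.6% 0.65 ± 2% turbostat.GFXWatt"))
```

With such a parser, a row like perf-stat.i.node-store-misses (new-side %stddev of 223) could be set aside before reading anything into its change value.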
***************************************************************************************************
lkp-cfl-d2: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/true/2560 x 1440/debian-x86_64-phoronix/lkp-cfl-d2/openarena-1.5.5/phoronix-test-suite
commit:
16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")
16a9359401edcbc0 54fef7ea35dadd66193b98805b0
---------------- ---------------------------
%stddev %change %stddev
\ | \
1459 ± 27% +34.9% 1967 ± 13% sched_debug.cpu.curr->pid.max
1.63 -0.5 1.14 ± 2% mpstat.cpu.all.iowait%
1.17 -0.1 1.06 mpstat.cpu.all.irq%
189.00 ± 11% -53.6% 87.67 ± 17% perf-c2c.DRAM.local
20.67 ± 28% -87.9% 2.50 ± 72% perf-c2c.HITM.local
6016 +5.7% 6358 vmstat.io.bi
30168 -5.0% 28666 vmstat.system.cs
27678 -2.7% 26933 vmstat.system.in
337.58 +13.2% 382.30 phoronix-test-suite.openarena.2560x1440.frames_per_second
2.95 -12.2% 2.59 phoronix-test-suite.openarena.2560x1440.milliseconds
191369 +23.8% 236938 phoronix-test-suite.time.minor_page_faults
51.00 +6.5% 54.33 phoronix-test-suite.time.percent_of_cpu_this_job_got
56192 -63.1% 20758 phoronix-test-suite.time.voluntary_context_switches
90640 -64.0% 32673 ± 7% turbostat.C1
0.68 ± 2% -0.3 0.38 ± 5% turbostat.C1%
46385 -12.3% 40688 ± 2% turbostat.C3
1.72 -0.2 1.51 ± 2% turbostat.C3%
3.58 ± 6% -17.6% 2.95 ± 11% turbostat.CPU%c3
5.71 -11.1% 5.08 turbostat.GFXWatt
29983 ± 4% -63.0% 11105 ± 24% turbostat.POLL
0.06 ± 8% -0.0 0.01 ± 35% turbostat.POLL%
1.31 ± 15% +70.2% 2.23 ± 7% turbostat.Pkg%pc2
29.56 -1.4% 29.14 turbostat.PkgWatt
2123 -5.7% 2001 proc-vmstat.nr_active_anon
37869 +2.8% 38912 proc-vmstat.nr_active_file
22828 -5.8% 21501 proc-vmstat.nr_unevictable
2123 -5.7% 2001 proc-vmstat.nr_zone_active_anon
37869 +2.8% 38912 proc-vmstat.nr_zone_active_file
22828 -5.8% 21501 proc-vmstat.nr_zone_unevictable
624578 +4.2% 650937 proc-vmstat.numa_hit
624578 +4.2% 650937 proc-vmstat.numa_local
771624 +3.5% 798755 proc-vmstat.pgalloc_normal
429393 +7.2% 460476 proc-vmstat.pgfault
620059 +4.1% 645242 proc-vmstat.pgfree
8.52 ±126% -7.9 0.62 ±223% perf-profile.calltrace.cycles-pp.wp_page_copy.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
6.09 ±114% -5.4 0.67 ±223% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
3.76 ±110% -3.8 0.00 perf-profile.calltrace.cycles-pp.error_entry
8.52 ±126% -7.9 0.62 ±223% perf-profile.children.cycles-pp.wp_page_copy
6.09 ±114% -5.4 0.67 ±223% perf-profile.children.cycles-pp.intel_idle
5.79 ±100% -4.6 1.24 ±223% perf-profile.children.cycles-pp.do_filp_open
5.79 ±100% -4.6 1.24 ±223% perf-profile.children.cycles-pp.path_openat
3.76 ±110% -3.8 0.00 perf-profile.children.cycles-pp.error_entry
3.76 ±110% -2.6 1.11 ±223% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
3.76 ±110% -2.6 1.11 ±223% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
3.76 ±110% -2.6 1.11 ±223% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
3.76 ±110% -2.6 1.11 ±223% perf-profile.children.cycles-pp.hrtimer_interrupt
6.09 ±114% -5.4 0.67 ±223% perf-profile.self.cycles-pp.intel_idle
3.76 ±110% -3.8 0.00 perf-profile.self.cycles-pp.error_entry
4.97 ±109% -2.2 2.78 ±223% perf-profile.self.cycles-pp.zap_pte_range
8.07e+08 +2.7% 8.29e+08 perf-stat.i.branch-instructions
31612011 +4.7% 33095236 perf-stat.i.cache-misses
31337 -4.0% 30085 perf-stat.i.context-switches
1.57 +7.7% 1.69 ± 2% perf-stat.i.cpi
4.798e+09 +3.2% 4.95e+09 perf-stat.i.cpu-cycles
0.14 ± 3% +0.0 0.16 ± 3% perf-stat.i.dTLB-load-miss-rate%
1.131e+09 +4.1% 1.177e+09 perf-stat.i.dTLB-loads
0.03 ± 2% +0.0 0.03 ± 2% perf-stat.i.dTLB-store-miss-rate%
134590 +6.1% 142835 perf-stat.i.dTLB-store-misses
5.907e+08 +4.5% 6.175e+08 perf-stat.i.dTLB-stores
3858714 +6.5% 4109072 perf-stat.i.iTLB-loads
4.646e+09 +3.6% 4.812e+09 perf-stat.i.instructions
0.40 +3.2% 0.41 perf-stat.i.metric.GHz
824.17 +6.4% 876.81 perf-stat.i.metric.K/sec
224.04 +3.7% 232.30 perf-stat.i.metric.M/sec
5577 +13.3% 6319 perf-stat.i.minor-faults
1661010 +8.8% 1806501 perf-stat.i.node-loads
4199874 +5.4% 4426929 perf-stat.i.node-stores
5582 +13.3% 6325 perf-stat.i.page-faults
151.84 -1.5% 149.61 perf-stat.overall.cycles-between-cache-misses
7.953e+08 +2.6% 8.16e+08 perf-stat.ps.branch-instructions
31131128 +4.6% 32563072 perf-stat.ps.cache-misses
30863 -4.1% 29604 perf-stat.ps.context-switches
4.727e+09 +3.1% 4.871e+09 perf-stat.ps.cpu-cycles
1.114e+09 +4.0% 1.158e+09 perf-stat.ps.dTLB-loads
132539 +6.0% 140536 perf-stat.ps.dTLB-store-misses
5.818e+08 +4.4% 6.076e+08 perf-stat.ps.dTLB-stores
3798756 +6.4% 4042216 perf-stat.ps.iTLB-loads
4.577e+09 +3.5% 4.736e+09 perf-stat.ps.instructions
5496 +13.2% 6220 perf-stat.ps.minor-faults
1635550 +8.7% 1777301 perf-stat.ps.node-loads
4135301 +5.3% 4355124 perf-stat.ps.node-stores
5501 +13.2% 6226 perf-stat.ps.page-faults
3.045e+11 -2.8% 2.961e+11 perf-stat.total.instructions
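The two openarena headline rows above can be cross-checked against each other. The sketch below recomputes both changes. It assumes, as a reading that this report does not state, that frames_per_second is a higher-is-better metric and that milliseconds (taken here to be time per frame) is lower-is-better. The helper names are hypothetical.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def is_improvement(delta_pct: float, higher_is_better: bool) -> bool:
    """Classify a change given the (assumed) direction of the metric."""
    return delta_pct > 0 if higher_is_better else delta_pct < 0


# (metric, base value, new value, assumed higher-is-better?)
rows = [
    ("openarena.2560x1440.frames_per_second", 337.58, 382.30, True),
    ("openarena.2560x1440.milliseconds", 2.95, 2.59, False),
]

for name, base, new, higher_is_better in rows:
    delta = pct_change(base, new)
    verdict = "improvement" if is_improvement(delta, higher_is_better) else "regression"
    print(f"{name}: {delta:+.1f}% ({verdict} under the stated assumption)")
```

Under that assumption both rows point the same way, so the "-12.2% regression" wording in the summary table may only reflect the sign of the change rather than a slowdown.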
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
421.03 +11.6% 469.88 perf-stat.i.metric.M/sec
3358 +5.5% 3542 perf-stat.i.minor-faults
0.00 ± 11% -0.0 0.00 ± 6% perf-stat.i.node-load-miss-rate%
6.40 ± 8% +17.6% 7.52 ± 5% perf-stat.i.node-load-misses
216221 ± 6% +61.9% 350134 perf-stat.i.node-loads
0.00 ± 16% +0.1 0.12 ±220% perf-stat.i.node-store-miss-rate%
6.72 ± 6% +6.9e+12% 4.615e+11 ±223% perf-stat.i.node-store-misses
694301 ± 9% -14.3% 594881 ± 3% perf-stat.i.node-stores
3361 +5.5% 3545 perf-stat.i.page-faults
0.47 ± 5% +52.6% 0.71 ± 2% perf-stat.overall.MPKI
1.65 -0.2 1.49 perf-stat.overall.branch-miss-rate%
2.81 ± 6% +1.7 4.52 ± 2% perf-stat.overall.cache-miss-rate%
0.46 -4.0% 0.45 perf-stat.overall.cpi
994.89 ± 5% -37.2% 624.75 ± 2% perf-stat.overall.cycles-between-cache-misses
0.03 -0.0 0.02 perf-stat.overall.dTLB-load-miss-rate%
0.01 -0.0 0.01 perf-stat.overall.dTLB-store-miss-rate%
30.99 ± 3% -2.3 28.68 ± 4% perf-stat.overall.iTLB-load-miss-rate%
25510 ± 11% +20.2% 30672 ± 5% perf-stat.overall.instructions-per-iTLB-miss
2.15 +4.1% 2.24 perf-stat.overall.ipc
0.00 ± 8% -0.0 0.00 ± 5% perf-stat.overall.node-load-miss-rate%
0.00 ± 8% +16.7 16.67 ±223% perf-stat.overall.node-store-miss-rate%
1.403e+09 +12.5% 1.578e+09 perf-stat.ps.branch-instructions
23153579 +1.7% 23558574 perf-stat.ps.branch-misses
4938806 ± 5% +71.3% 8457880 ± 2% perf-stat.ps.cache-misses
1.758e+08 +6.4% 1.871e+08 perf-stat.ps.cache-references
4.901e+09 +7.8% 5.282e+09 perf-stat.ps.cpu-cycles
39.66 ± 3% -15.1% 33.67 ± 2% perf-stat.ps.cpu-migrations
2.795e+09 +11.8% 3.125e+09 perf-stat.ps.dTLB-loads
6.432e+08 +10.4% 7.103e+08 perf-stat.ps.dTLB-stores
1.055e+10 +12.2% 1.184e+10 perf-stat.ps.instructions
3336 +5.5% 3518 perf-stat.ps.minor-faults
6.35 ± 8% +17.6% 7.47 ± 5% perf-stat.ps.node-load-misses
214756 ± 6% +61.9% 347761 perf-stat.ps.node-loads
6.67 ± 6% +6.9e+12% 4.577e+11 ±223% perf-stat.ps.node-store-misses
689455 ± 9% -14.3% 590802 ± 3% perf-stat.ps.node-stores
3338 +5.5% 3521 perf-stat.ps.page-faults
1.467e+12 ± 4% +15.7% 1.698e+12 ± 2% perf-stat.total.instructions
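
For readers re-checking the table above: rows whose change column ends in "%" appear to report a relative change, while rows for metrics that are themselves percentages (names ending in "%", such as mpstat.cpu.all.iowait%) appear to report an absolute difference in percentage points. The sketch below is not part of lkp-tests; the regular expression and the helper name parse_row are hypothetical, written only against the row layout visible in this report. It recomputes the change from the two printed means, so small rounding differences against the report are expected.

```python
import re

# Hypothetical pattern for one whitespace-collapsed comparison row:
#   <base> [± sd%] <change>[%] <new> [± sd%] <metric>
ROW = re.compile(
    r"^\s*(?P<base>\d[\d.e+]*)(?:\s*±\s*\d+%)?"
    r"\s+(?P<change>[-+]\d[\d.e+]*)(?P<unit>%?)"
    r"\s+(?P<new>\d[\d.e+]*)(?:\s*±\s*\d+%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Parse one row and recompute its change from the printed means.

    Returns None when the line does not look like a data row.
    """
    m = ROW.match(line)
    if m is None:
        return None
    base = float(m["base"])
    new = float(m["new"])
    relative = m["unit"] == "%"
    if relative:
        # Relative change in percent (assumes base is non-zero).
        recomputed = (new - base) / base * 100.0
    else:
        # Absolute difference, e.g. percentage points for "...%" metrics.
        recomputed = new - base
    return {
        "metric": m["metric"],
        "base": base,
        "new": new,
        "reported": float(m["change"]),
        "relative": relative,
        "recomputed": recomputed,
    }


if __name__ == "__main__":
    rows = [
        "106.83 +12.8% 120.50 phoronix-test-suite.x11perf."
        "PutImageXY500x500Square.operations___second",
        "1.02 ± 3% -0.6 0.43 ± 3% mpstat.cpu.all.iowait%",
    ]
    for row in rows:
        print(parse_row(row))
```

Rows with very large standard deviations (for example the node-store-misses rows at ±223%) parse the same way, but the recomputed figure is only as meaningful as the noisy mean it is derived from.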
***************************************************************************************************
lkp-cfl-d2: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/true/2560 x 1440/debian-x86_64-phoronix/lkp-cfl-d2/openarena-1.5.5/phoronix-test-suite
commit:
16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")
16a9359401edcbc0 54fef7ea35dadd66193b98805b0
---------------- ---------------------------
%stddev %change %stddev
\ | \
1459 ± 27% +34.9% 1967 ± 13% sched_debug.cpu.curr->pid.max
1.63 -0.5 1.14 ± 2% mpstat.cpu.all.iowait%
1.17 -0.1 1.06 mpstat.cpu.all.irq%
189.00 ± 11% -53.6% 87.67 ± 17% perf-c2c.DRAM.local
20.67 ± 28% -87.9% 2.50 ± 72% perf-c2c.HITM.local
6016 +5.7% 6358 vmstat.io.bi
30168 -5.0% 28666 vmstat.system.cs
27678 -2.7% 26933 vmstat.system.in
337.58 +13.2% 382.30 phoronix-test-suite.openarena.2560x1440.frames_per_second
2.95 -12.2% 2.59 phoronix-test-suite.openarena.2560x1440.milliseconds
191369 +23.8% 236938 phoronix-test-suite.time.minor_page_faults
51.00 +6.5% 54.33 phoronix-test-suite.time.percent_of_cpu_this_job_got
56192 -63.1% 20758 phoronix-test-suite.time.voluntary_context_switches
90640 -64.0% 32673 ± 7% turbostat.C1
0.68 ± 2% -0.3 0.38 ± 5% turbostat.C1%
46385 -12.3% 40688 ± 2% turbostat.C3
1.72 -0.2 1.51 ± 2% turbostat.C3%
3.58 ± 6% -17.6% 2.95 ± 11% turbostat.CPU%c3
5.71 -11.1% 5.08 turbostat.GFXWatt
29983 ± 4% -63.0% 11105 ± 24% turbostat.POLL
0.06 ± 8% -0.0 0.01 ± 35% turbostat.POLL%
1.31 ± 15% +70.2% 2.23 ± 7% turbostat.Pkg%pc2
29.56 -1.4% 29.14 turbostat.PkgWatt
2123 -5.7% 2001 proc-vmstat.nr_active_anon
37869 +2.8% 38912 proc-vmstat.nr_active_file
22828 -5.8% 21501 proc-vmstat.nr_unevictable
2123 -5.7% 2001 proc-vmstat.nr_zone_active_anon
37869 +2.8% 38912 proc-vmstat.nr_zone_active_file
22828 -5.8% 21501 proc-vmstat.nr_zone_unevictable
624578 +4.2% 650937 proc-vmstat.numa_hit
624578 +4.2% 650937 proc-vmstat.numa_local
771624 +3.5% 798755 proc-vmstat.pgalloc_normal
429393 +7.2% 460476 proc-vmstat.pgfault
620059 +4.1% 645242 proc-vmstat.pgfree
8.52 ±126% -7.9 0.62 ±223% perf-profile.calltrace.cycles-pp.wp_page_copy.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
6.09 ±114% -5.4 0.67 ±223% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
3.76 ±110% -3.8 0.00 perf-profile.calltrace.cycles-pp.error_entry
8.52 ±126% -7.9 0.62 ±223% perf-profile.children.cycles-pp.wp_page_copy
6.09 ±114% -5.4 0.67 ±223% perf-profile.children.cycles-pp.intel_idle
5.79 ±100% -4.6 1.24 ±223% perf-profile.children.cycles-pp.do_filp_open
5.79 ±100% -4.6 1.24 ±223% perf-profile.children.cycles-pp.path_openat
3.76 ±110% -3.8 0.00 perf-profile.children.cycles-pp.error_entry
3.76 ±110% -2.6 1.11 ±223% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
3.76 ±110% -2.6 1.11 ±223% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
3.76 ±110% -2.6 1.11 ±223% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
3.76 ±110% -2.6 1.11 ±223% perf-profile.children.cycles-pp.hrtimer_interrupt
6.09 ±114% -5.4 0.67 ±223% perf-profile.self.cycles-pp.intel_idle
3.76 ±110% -3.8 0.00 perf-profile.self.cycles-pp.error_entry
4.97 ±109% -2.2 2.78 ±223% perf-profile.self.cycles-pp.zap_pte_range
8.07e+08 +2.7% 8.29e+08 perf-stat.i.branch-instructions
31612011 +4.7% 33095236 perf-stat.i.cache-misses
31337 -4.0% 30085 perf-stat.i.context-switches
1.57 +7.7% 1.69 ± 2% perf-stat.i.cpi
4.798e+09 +3.2% 4.95e+09 perf-stat.i.cpu-cycles
0.14 ± 3% +0.0 0.16 ± 3% perf-stat.i.dTLB-load-miss-rate%
1.131e+09 +4.1% 1.177e+09 perf-stat.i.dTLB-loads
0.03 ± 2% +0.0 0.03 ± 2% perf-stat.i.dTLB-store-miss-rate%
134590 +6.1% 142835 perf-stat.i.dTLB-store-misses
5.907e+08 +4.5% 6.175e+08 perf-stat.i.dTLB-stores
3858714 +6.5% 4109072 perf-stat.i.iTLB-loads
4.646e+09 +3.6% 4.812e+09 perf-stat.i.instructions
0.40 +3.2% 0.41 perf-stat.i.metric.GHz
824.17 +6.4% 876.81 perf-stat.i.metric.K/sec
224.04 +3.7% 232.30 perf-stat.i.metric.M/sec
5577 +13.3% 6319 perf-stat.i.minor-faults
1661010 +8.8% 1806501 perf-stat.i.node-loads
4199874 +5.4% 4426929 perf-stat.i.node-stores
5582 +13.3% 6325 perf-stat.i.page-faults
151.84 -1.5% 149.61 perf-stat.overall.cycles-between-cache-misses
7.953e+08 +2.6% 8.16e+08 perf-stat.ps.branch-instructions
31131128 +4.6% 32563072 perf-stat.ps.cache-misses
30863 -4.1% 29604 perf-stat.ps.context-switches
4.727e+09 +3.1% 4.871e+09 perf-stat.ps.cpu-cycles
1.114e+09 +4.0% 1.158e+09 perf-stat.ps.dTLB-loads
132539 +6.0% 140536 perf-stat.ps.dTLB-store-misses
5.818e+08 +4.4% 6.076e+08 perf-stat.ps.dTLB-stores
3798756 +6.4% 4042216 perf-stat.ps.iTLB-loads
4.577e+09 +3.5% 4.736e+09 perf-stat.ps.instructions
5496 +13.2% 6220 perf-stat.ps.minor-faults
1635550 +8.7% 1777301 perf-stat.ps.node-loads
4135301 +5.3% 4355124 perf-stat.ps.node-stores
5501 +13.2% 6226 perf-stat.ps.page-faults
3.045e+11 -2.8% 2.961e+11 perf-stat.total.instructions
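
A note on reading the two openarena result rows above: frames_per_second is a higher-is-better metric and milliseconds is presumably the per-frame time, which is lower-is-better. Under that assumption (the report itself does not define the metric), the +13.2% and -12.2% changes point in the same direction. The sketch below uses hypothetical helper names and only the four means printed in the table. The frame time derived from the mean frame rate need not match the reported milliseconds exactly, since a mean of frame times is not the reciprocal of a mean frame rate.

```python
def relative_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def frame_time_ms(fps):
    """Frame time in milliseconds implied by a mean frame rate."""
    return 1000.0 / fps


# Means copied from the openarena rows of the table above.
FPS_BASE, FPS_NEW = 337.58, 382.30
MS_BASE, MS_NEW = 2.95, 2.59

if __name__ == "__main__":
    print("fps change (%):", round(relative_change(FPS_BASE, FPS_NEW), 1))
    print("ms change (%):", round(relative_change(MS_BASE, MS_NEW), 1))
    print("implied frame time, base (ms):", round(frame_time_ms(FPS_BASE), 2))
    print("implied frame time, new (ms):", round(frame_time_ms(FPS_NEW), 2))
```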
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki