From: kernel test robot <oliver.sang@intel.com>
To: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: feng.tang@intel.com, intel-gfx@lists.freedesktop.org,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	fengwei.yin@intel.com, dri-devel@lists.freedesktop.org,
	oliver.sang@intel.com, ying.huang@intel.com,
	oe-lkp@lists.linux.dev
Subject: Re: [Intel-gfx] [PATCH] drm/i915/gem: Allow users to disable waitboost
Date: Tue, 26 Sep 2023 10:58:47 +0800
Message-ID: <202309261055.b74df987-oliver.sang@intel.com>
In-Reply-To: <20230920215624.3482244-1-vinay.belgaumkar@intel.com>



Hello,

kernel test robot noticed a -3.2% regression of phoronix-test-suite.paraview.WaveletContour.1024x768.mipolys___sec on:


commit: 54fef7ea35dadd66193b98805b0bc42ef2b279db ("[PATCH] drm/i915/gem: Allow users to disable waitboost")
url: https://github.com/intel-lab-lkp/linux/commits/Vinay-Belgaumkar/drm-i915-gem-Allow-users-to-disable-waitboost/20230921-060357
base: git://anongit.freedesktop.org/drm-intel for-linux-next
patch link: https://lore.kernel.org/all/20230920215624.3482244-1-vinay.belgaumkar@intel.com/
patch subject: [PATCH] drm/i915/gem: Allow users to disable waitboost

testcase: phoronix-test-suite
test machine: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
parameters:

	need_x: true
	test: paraview-1.0.2
	option_a: Wavelet Contour
	option_b: 1024 x 768
	cpufreq_governor: performance


In addition to that, the commit also has a significant impact on the following tests:

+------------------+----------------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.x11perf.PutImageXY500x500Square.operations___second 12.8% improvement |
| test machine     | 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory                     |
| test parameters  | cpufreq_governor=performance                                                                                   |
|                  | need_x=true                                                                                                    |
|                  | option_a=PutImage XY 500x500 Square                                                                            |
|                  | test=x11perf-1.1.1                                                                                             |
+------------------+----------------------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.openarena.2560x1440.milliseconds -12.2% regression                    |
| test machine     | 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory                     |
| test parameters  | cpufreq_governor=performance                                                                                   |
|                  | need_x=true                                                                                                    |
|                  | option_a=2560 x 1440                                                                                           |
|                  | test=openarena-1.5.5                                                                                           |
+------------------+----------------------------------------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202309261055.b74df987-oliver.sang@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230926/202309261055.b74df987-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/option_b/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/true/Wavelet Contour/1024 x 768/debian-x86_64-phoronix/lkp-cfl-d2/paraview-1.0.2/phoronix-test-suite

commit: 
  16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
  54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")

16a9359401edcbc0 54fef7ea35dadd66193b98805b0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.05 ±  4%      +0.0        0.06 ±  2%  mpstat.cpu.all.soft%
     66.17 ± 60%    +145.6%     162.50 ± 33%  turbostat.C10
     28.61            -3.3%      27.68        phoronix-test-suite.paraview.WaveletContour.1024x768.frames___sec
    298.15            -3.2%     288.49        phoronix-test-suite.paraview.WaveletContour.1024x768.mipolys___sec
    535005            +8.6%     580810        phoronix-test-suite.time.minor_page_faults
      6278            +8.3%       6797        phoronix-test-suite.time.voluntary_context_switches
    801166            +5.6%     845675        proc-vmstat.numa_hit
    799382            +5.1%     840353        proc-vmstat.numa_local
     59648            +2.6%      61211        proc-vmstat.pgactivate
   1539307            +2.7%    1580759        proc-vmstat.pgalloc_normal
    734297            +6.8%     783862        proc-vmstat.pgfault
   1343231            +3.1%    1385353        proc-vmstat.pgfree
     39042            +3.7%      40480 ±  2%  proc-vmstat.pgreuse
 1.106e+08            +2.1%  1.129e+08        perf-stat.i.cache-references
 4.255e+09            +1.9%  4.336e+09        perf-stat.i.cpu-cycles
    147872            +2.4%     151392        perf-stat.i.dTLB-store-misses
    230242            +2.2%     235419        perf-stat.i.iTLB-load-misses
    569455            +2.4%     583234        perf-stat.i.iTLB-loads
      0.35            +1.9%       0.36        perf-stat.i.metric.GHz
     12547            +8.6%      13625        perf-stat.i.minor-faults
    609443 ±  2%      +4.0%     633739 ±  2%  perf-stat.i.node-loads
   1701083 ±  2%      +5.0%    1786794 ±  2%  perf-stat.i.node-stores
     12566            +8.6%      13644        perf-stat.i.page-faults
 1.085e+08            +2.1%  1.108e+08        perf-stat.ps.cache-references
 4.179e+09            +1.9%  4.259e+09        perf-stat.ps.cpu-cycles
    225995            +2.3%     231096        perf-stat.ps.iTLB-load-misses
    558694            +2.4%     572272        perf-stat.ps.iTLB-loads
     12320            +8.6%      13380        perf-stat.ps.minor-faults
    598059 ±  2%      +4.0%     621901 ±  2%  perf-stat.ps.node-loads
   1670891 ±  2%      +5.0%    1754927 ±  2%  perf-stat.ps.node-stores
     12339            +8.6%      13399        perf-stat.ps.page-faults
     14.86 ± 42%     -11.7        3.12 ±152%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     14.86 ± 42%     -11.7        3.12 ±152%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     14.86 ± 42%     -11.7        3.12 ±152%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
     14.86 ± 42%     -11.0        3.82 ±161%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
     11.53 ± 67%     -10.1        1.39 ±223%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
     11.53 ± 67%      -9.4        2.08 ±223%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
     11.53 ± 67%      -9.1        2.43 ±143%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      6.02 ± 95%      -5.3        0.70 ±223%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      5.51 ± 56%      -4.8        0.70 ±223%  perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.do_group_exit.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
     14.86 ± 42%     -11.7        3.12 ±152%  perf-profile.children.cycles-pp.start_secondary
     14.86 ± 42%     -11.0        3.82 ±161%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
     14.86 ± 42%     -11.0        3.82 ±161%  perf-profile.children.cycles-pp.cpu_startup_entry
     14.86 ± 42%     -11.0        3.82 ±161%  perf-profile.children.cycles-pp.do_idle
     11.53 ± 67%      -9.4        2.08 ±223%  perf-profile.children.cycles-pp.cpuidle_enter
     11.53 ± 67%      -9.4        2.08 ±223%  perf-profile.children.cycles-pp.cpuidle_enter_state
     11.53 ± 67%      -8.4        3.12 ±152%  perf-profile.children.cycles-pp.cpuidle_idle_call
      6.02 ± 95%      -5.3        0.70 ±223%  perf-profile.children.cycles-pp.intel_idle
      5.51 ± 56%      -4.8        0.70 ±223%  perf-profile.children.cycles-pp.intel_idle_ibrs
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.children.cycles-pp.arch_do_signal_or_restart
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.children.cycles-pp.get_signal
      6.02 ± 95%      -5.3        0.70 ±223%  perf-profile.self.cycles-pp.intel_idle
      5.51 ± 56%      -4.8        0.70 ±223%  perf-profile.self.cycles-pp.intel_idle_ibrs
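
For readers who want to check the %change column, the relative change on count-like rows can be recomputed from the two mean values printed on each row. The sketch below is an illustration only, not the lkp-tests implementation; the function names are hypothetical, and because the report prints rounded means, a recomputed figure may differ in the last digit on some rows.

```python
def percent_change(base: float, patched: float) -> float:
    """Relative change of the patched mean versus the base mean, in percent."""
    return (patched - base) / base * 100.0


def format_change(base: float, patched: float) -> str:
    """Format the change with an explicit sign and one decimal, as in the table."""
    return f"{percent_change(base, patched):+.1f}%"


# Rows copied from the paraview table above (base commit, patched commit).
rows = {
    "paraview.WaveletContour.1024x768.mipolys___sec": (298.15, 288.49),
    "paraview.WaveletContour.1024x768.frames___sec": (28.61, 27.68),
    "time.minor_page_faults": (535005, 580810),
}

for name, (base, patched) in rows.items():
    print(f"{format_change(base, patched):>8}  {name}")
```

The three rows used here recompute to the same figures the report lists for them (-3.2%, -3.3% and +8.6%).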


***************************************************************************************************
lkp-cfl-d2: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/true/PutImage XY 500x500 Square/debian-x86_64-phoronix/lkp-cfl-d2/x11perf-1.1.1/phoronix-test-suite

commit: 
  16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
  54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")

16a9359401edcbc0 54fef7ea35dadd66193b98805b0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     11950 ±  3%     +93.9%      23173 ±  3%  meminfo.Unevictable
      1.02 ±  3%      -0.6        0.43 ±  3%  mpstat.cpu.all.iowait%
     21273            -5.3%      20141        vmstat.system.in
      4887 ± 32%     -67.7%       1579 ± 32%  phoronix-test-suite.time.involuntary_context_switches
    147212           +26.1%     185677        phoronix-test-suite.time.minor_page_faults
     76.83           +10.6%      85.00        phoronix-test-suite.time.percent_of_cpu_this_job_got
     96.48 ±  4%     +15.0%     110.93 ±  2%  phoronix-test-suite.time.user_time
    106.83           +12.8%     120.50        phoronix-test-suite.x11perf.PutImageXY500x500Square.operations___second
      3.73 ±  3%      -0.6        3.12 ±  2%  turbostat.C1E%
     58316 ±  5%     -20.6%      46326 ±  3%  turbostat.C3
      0.86 ±  2%      -0.1        0.74 ±  3%  turbostat.C3%
      1.24 ±  8%     -18.9%       1.01 ± 10%  turbostat.CPU%c3
     38.67 ±  2%      +5.0       43.70 ±  2%  turbostat.CPUGFX%
     23.44            +4.0%      24.38        turbostat.CorWatt
      2.30 ±  4%     -71.6%       0.65 ±  2%  turbostat.GFXWatt
     26.19            -2.7%      25.49        turbostat.PkgWatt
      1.41            +2.4%       1.44        turbostat.RAMWatt
     61.83 ± 23%     -20.2       41.62 ± 31%  perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     61.85 ± 23%     -20.2       41.64 ± 31%  perf-profile.children.cycles-pp.intel_idle_ibrs
      1.24 ± 59%      -1.0        0.23 ± 27%  perf-profile.children.cycles-pp.worker_thread
      0.84 ± 60%      -0.7        0.18 ± 48%  perf-profile.children.cycles-pp.asm_common_interrupt
      0.83 ± 61%      -0.7        0.18 ± 48%  perf-profile.children.cycles-pp.common_interrupt
      0.35 ± 74%      -0.3        0.06 ± 85%  perf-profile.children.cycles-pp.__common_interrupt
      0.35 ± 75%      -0.3        0.06 ± 85%  perf-profile.children.cycles-pp.handle_edge_irq
      0.30 ± 78%      -0.3        0.05 ± 81%  perf-profile.children.cycles-pp.handle_irq_event
      0.37 ± 18%      -0.2        0.18 ± 46%  perf-profile.children.cycles-pp.newidle_balance
      0.12 ± 32%      -0.1        0.04 ± 79%  perf-profile.children.cycles-pp.exit_mm
      0.12 ± 30%      -0.1        0.05 ± 73%  perf-profile.children.cycles-pp.native_apic_msr_eoi
      0.03 ±144%      +0.1        0.10 ± 38%  perf-profile.children.cycles-pp.dma_resv_iter_first_unlocked
      0.04 ±154%      +0.1        0.19 ± 40%  perf-profile.children.cycles-pp.i915_gem_busy_ioctl
     61.82 ± 23%     -20.2       41.63 ± 31%  perf-profile.self.cycles-pp.intel_idle_ibrs
      0.12 ± 26%      -0.1        0.05 ± 74%  perf-profile.self.cycles-pp.native_apic_msr_eoi
     31861            +5.9%      33748        proc-vmstat.nr_active_file
     29007            +7.6%      31200        proc-vmstat.nr_mapped
    183598            +1.6%     186520        proc-vmstat.nr_shmem
      2987 ±  3%     +93.9%       5793 ±  3%  proc-vmstat.nr_unevictable
     31861            +5.9%      33748        proc-vmstat.nr_zone_active_file
      2987 ±  3%     +93.9%       5793 ±  3%  proc-vmstat.nr_zone_unevictable
    721353 ±  2%     +13.3%     816958        proc-vmstat.numa_hit
    721406 ±  2%     +13.2%     816970        proc-vmstat.numa_local
     21301 ±  4%     +13.1%      24099        proc-vmstat.pgactivate
   2459642 ±  3%     +15.5%    2840123 ±  2%  proc-vmstat.pgalloc_normal
    530507 ±  3%      +8.0%     573060        proc-vmstat.pgfault
   2336660 ±  3%     +16.3%    2717873 ±  2%  proc-vmstat.pgfree
      9973 ±  2%    +143.6%      24292 ±  3%  proc-vmstat.unevictable_pgs_culled
      9875          +146.0%      24292 ±  3%  proc-vmstat.unevictable_pgs_rescued
      9876          +146.1%      24304 ±  3%  proc-vmstat.unevictable_pgs_scanned
 1.413e+09           +12.4%  1.588e+09        perf-stat.i.branch-instructions
      2.09            -0.1        1.99 ±  2%  perf-stat.i.branch-miss-rate%
  23305261            +1.8%   23715959        perf-stat.i.branch-misses
      4.57 ±  4%      +1.5        6.06 ±  2%  perf-stat.i.cache-miss-rate%
   4971151 ±  5%     +71.3%    8515308 ±  2%  perf-stat.i.cache-misses
 1.771e+08            +6.4%  1.884e+08        perf-stat.i.cache-references
      0.91            -3.5%       0.88 ±  2%  perf-stat.i.cpi
 4.936e+09            +7.8%  5.319e+09        perf-stat.i.cpu-cycles
     39.95 ±  3%     -15.1%      33.90 ±  2%  perf-stat.i.cpu-migrations
      1671 ±  4%     -34.5%       1095 ±  6%  perf-stat.i.cycles-between-cache-misses
 2.815e+09           +11.8%  3.147e+09        perf-stat.i.dTLB-loads
      0.02 ±  2%      -0.0        0.02 ±  4%  perf-stat.i.dTLB-store-miss-rate%
 6.478e+08           +10.4%  7.153e+08        perf-stat.i.dTLB-stores
     40.15 ±  2%      -2.9       37.29        perf-stat.i.iTLB-load-miss-rate%
 1.063e+10           +12.2%  1.192e+10        perf-stat.i.instructions
     40912 ±  4%     +25.5%      51361 ±  2%  perf-stat.i.instructions-per-iTLB-miss
      2.18            +2.7%       2.23        perf-stat.i.ipc
      0.41            +7.8%       0.44        perf-stat.i.metric.GHz
    421.03           +11.6%     469.88        perf-stat.i.metric.M/sec
      3358            +5.5%       3542        perf-stat.i.minor-faults
      0.00 ± 11%      -0.0        0.00 ±  6%  perf-stat.i.node-load-miss-rate%
      6.40 ±  8%     +17.6%       7.52 ±  5%  perf-stat.i.node-load-misses
    216221 ±  6%     +61.9%     350134        perf-stat.i.node-loads
      0.00 ± 16%      +0.1        0.12 ±220%  perf-stat.i.node-store-miss-rate%
      6.72 ±  6%  +6.9e+12%  4.615e+11 ±223%  perf-stat.i.node-store-misses
    694301 ±  9%     -14.3%     594881 ±  3%  perf-stat.i.node-stores
      3361            +5.5%       3545        perf-stat.i.page-faults
      0.47 ±  5%     +52.6%       0.71 ±  2%  perf-stat.overall.MPKI
      1.65            -0.2        1.49        perf-stat.overall.branch-miss-rate%
      2.81 ±  6%      +1.7        4.52 ±  2%  perf-stat.overall.cache-miss-rate%
      0.46            -4.0%       0.45        perf-stat.overall.cpi
    994.89 ±  5%     -37.2%     624.75 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.03            -0.0        0.02        perf-stat.overall.dTLB-load-miss-rate%
      0.01            -0.0        0.01        perf-stat.overall.dTLB-store-miss-rate%
     30.99 ±  3%      -2.3       28.68 ±  4%  perf-stat.overall.iTLB-load-miss-rate%
     25510 ± 11%     +20.2%      30672 ±  5%  perf-stat.overall.instructions-per-iTLB-miss
      2.15            +4.1%       2.24        perf-stat.overall.ipc
      0.00 ±  8%      -0.0        0.00 ±  5%  perf-stat.overall.node-load-miss-rate%
      0.00 ±  8%     +16.7       16.67 ±223%  perf-stat.overall.node-store-miss-rate%
 1.403e+09           +12.5%  1.578e+09        perf-stat.ps.branch-instructions
  23153579            +1.7%   23558574        perf-stat.ps.branch-misses
   4938806 ±  5%     +71.3%    8457880 ±  2%  perf-stat.ps.cache-misses
 1.758e+08            +6.4%  1.871e+08        perf-stat.ps.cache-references
 4.901e+09            +7.8%  5.282e+09        perf-stat.ps.cpu-cycles
     39.66 ±  3%     -15.1%      33.67 ±  2%  perf-stat.ps.cpu-migrations
 2.795e+09           +11.8%  3.125e+09        perf-stat.ps.dTLB-loads
 6.432e+08           +10.4%  7.103e+08        perf-stat.ps.dTLB-stores
 1.055e+10           +12.2%  1.184e+10        perf-stat.ps.instructions
      3336            +5.5%       3518        perf-stat.ps.minor-faults
      6.35 ±  8%     +17.6%       7.47 ±  5%  perf-stat.ps.node-load-misses
    214756 ±  6%     +61.9%     347761        perf-stat.ps.node-loads
      6.67 ±  6%  +6.9e+12%  4.577e+11 ±223%  perf-stat.ps.node-store-misses
    689455 ±  9%     -14.3%     590802 ±  3%  perf-stat.ps.node-stores
      3338            +5.5%       3521        perf-stat.ps.page-faults
 1.467e+12 ±  4%     +15.7%  1.698e+12 ±  2%  perf-stat.total.instructions
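
Rows whose metric name ends in "%" (for example mpstat.cpu.all.iowait% or turbostat.CPUGFX%) show a bare number in the middle column instead of a percentage. Judging from the values in this report, that number appears to be the absolute difference in percentage points between the two means; this is an inference from the data, not something the report states. A minimal sketch with a hypothetical helper name checks that reading against a few rows from the x11perf table:

```python
def point_change(base_pct: float, patched_pct: float) -> str:
    """Absolute difference between two percentage values, in percentage points."""
    return f"{patched_pct - base_pct:+.1f}"


# Rows copied from the x11perf table above (base commit, patched commit).
rows = {
    "mpstat.cpu.all.iowait%": (1.02, 0.43),
    "turbostat.C1E%": (3.73, 3.12),
    "turbostat.CPUGFX%": (38.67, 43.70),
    "perf-stat.i.cache-miss-rate%": (4.57, 6.06),
}

for name, (base, patched) in rows.items():
    print(f"{point_change(base, patched):>6}  {name}")
```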



***************************************************************************************************
lkp-cfl-d2: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/true/2560 x 1440/debian-x86_64-phoronix/lkp-cfl-d2/openarena-1.5.5/phoronix-test-suite

commit: 
  16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
  54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")

16a9359401edcbc0 54fef7ea35dadd66193b98805b0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1459 ± 27%     +34.9%       1967 ± 13%  sched_debug.cpu.curr->pid.max
      1.63            -0.5        1.14 ±  2%  mpstat.cpu.all.iowait%
      1.17            -0.1        1.06        mpstat.cpu.all.irq%
    189.00 ± 11%     -53.6%      87.67 ± 17%  perf-c2c.DRAM.local
     20.67 ± 28%     -87.9%       2.50 ± 72%  perf-c2c.HITM.local
      6016            +5.7%       6358        vmstat.io.bi
     30168            -5.0%      28666        vmstat.system.cs
     27678            -2.7%      26933        vmstat.system.in
    337.58           +13.2%     382.30        phoronix-test-suite.openarena.2560x1440.frames_per_second
      2.95           -12.2%       2.59        phoronix-test-suite.openarena.2560x1440.milliseconds
    191369           +23.8%     236938        phoronix-test-suite.time.minor_page_faults
     51.00            +6.5%      54.33        phoronix-test-suite.time.percent_of_cpu_this_job_got
     56192           -63.1%      20758        phoronix-test-suite.time.voluntary_context_switches
     90640           -64.0%      32673 ±  7%  turbostat.C1
      0.68 ±  2%      -0.3        0.38 ±  5%  turbostat.C1%
     46385           -12.3%      40688 ±  2%  turbostat.C3
      1.72            -0.2        1.51 ±  2%  turbostat.C3%
      3.58 ±  6%     -17.6%       2.95 ± 11%  turbostat.CPU%c3
      5.71           -11.1%       5.08        turbostat.GFXWatt
     29983 ±  4%     -63.0%      11105 ± 24%  turbostat.POLL
      0.06 ±  8%      -0.0        0.01 ± 35%  turbostat.POLL%
      1.31 ± 15%     +70.2%       2.23 ±  7%  turbostat.Pkg%pc2
     29.56            -1.4%      29.14        turbostat.PkgWatt
      2123            -5.7%       2001        proc-vmstat.nr_active_anon
     37869            +2.8%      38912        proc-vmstat.nr_active_file
     22828            -5.8%      21501        proc-vmstat.nr_unevictable
      2123            -5.7%       2001        proc-vmstat.nr_zone_active_anon
     37869            +2.8%      38912        proc-vmstat.nr_zone_active_file
     22828            -5.8%      21501        proc-vmstat.nr_zone_unevictable
    624578            +4.2%     650937        proc-vmstat.numa_hit
    624578            +4.2%     650937        proc-vmstat.numa_local
    771624            +3.5%     798755        proc-vmstat.pgalloc_normal
    429393            +7.2%     460476        proc-vmstat.pgfault
    620059            +4.1%     645242        proc-vmstat.pgfree
      8.52 ±126%      -7.9        0.62 ±223%  perf-profile.calltrace.cycles-pp.wp_page_copy.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      6.09 ±114%      -5.4        0.67 ±223%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      3.76 ±110%      -3.8        0.00        perf-profile.calltrace.cycles-pp.error_entry
      8.52 ±126%      -7.9        0.62 ±223%  perf-profile.children.cycles-pp.wp_page_copy
      6.09 ±114%      -5.4        0.67 ±223%  perf-profile.children.cycles-pp.intel_idle
      5.79 ±100%      -4.6        1.24 ±223%  perf-profile.children.cycles-pp.do_filp_open
      5.79 ±100%      -4.6        1.24 ±223%  perf-profile.children.cycles-pp.path_openat
      3.76 ±110%      -3.8        0.00        perf-profile.children.cycles-pp.error_entry
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.hrtimer_interrupt
      6.09 ±114%      -5.4        0.67 ±223%  perf-profile.self.cycles-pp.intel_idle
      3.76 ±110%      -3.8        0.00        perf-profile.self.cycles-pp.error_entry
      4.97 ±109%      -2.2        2.78 ±223%  perf-profile.self.cycles-pp.zap_pte_range
  8.07e+08            +2.7%   8.29e+08        perf-stat.i.branch-instructions
  31612011            +4.7%   33095236        perf-stat.i.cache-misses
     31337            -4.0%      30085        perf-stat.i.context-switches
      1.57            +7.7%       1.69 ±  2%  perf-stat.i.cpi
 4.798e+09            +3.2%   4.95e+09        perf-stat.i.cpu-cycles
      0.14 ±  3%      +0.0        0.16 ±  3%  perf-stat.i.dTLB-load-miss-rate%
 1.131e+09            +4.1%  1.177e+09        perf-stat.i.dTLB-loads
      0.03 ±  2%      +0.0        0.03 ±  2%  perf-stat.i.dTLB-store-miss-rate%
    134590            +6.1%     142835        perf-stat.i.dTLB-store-misses
 5.907e+08            +4.5%  6.175e+08        perf-stat.i.dTLB-stores
   3858714            +6.5%    4109072        perf-stat.i.iTLB-loads
 4.646e+09            +3.6%  4.812e+09        perf-stat.i.instructions
      0.40            +3.2%       0.41        perf-stat.i.metric.GHz
    824.17            +6.4%     876.81        perf-stat.i.metric.K/sec
    224.04            +3.7%     232.30        perf-stat.i.metric.M/sec
      5577           +13.3%       6319        perf-stat.i.minor-faults
   1661010            +8.8%    1806501        perf-stat.i.node-loads
   4199874            +5.4%    4426929        perf-stat.i.node-stores
      5582           +13.3%       6325        perf-stat.i.page-faults
    151.84            -1.5%     149.61        perf-stat.overall.cycles-between-cache-misses
 7.953e+08            +2.6%   8.16e+08        perf-stat.ps.branch-instructions
  31131128            +4.6%   32563072        perf-stat.ps.cache-misses
     30863            -4.1%      29604        perf-stat.ps.context-switches
 4.727e+09            +3.1%  4.871e+09        perf-stat.ps.cpu-cycles
 1.114e+09            +4.0%  1.158e+09        perf-stat.ps.dTLB-loads
    132539            +6.0%     140536        perf-stat.ps.dTLB-store-misses
 5.818e+08            +4.4%  6.076e+08        perf-stat.ps.dTLB-stores
   3798756            +6.4%    4042216        perf-stat.ps.iTLB-loads
 4.577e+09            +3.5%  4.736e+09        perf-stat.ps.instructions
      5496           +13.2%       6220        perf-stat.ps.minor-faults
   1635550            +8.7%    1777301        perf-stat.ps.node-loads
   4135301            +5.3%    4355124        perf-stat.ps.node-stores
      5501           +13.2%       6226        perf-stat.ps.page-faults
 3.045e+11            -2.8%  2.961e+11        perf-stat.total.instructions
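
The openarena table reports both frames_per_second and milliseconds. Assuming the milliseconds metric is the average time per frame (the report does not say so explicitly), the two move in opposite directions: a higher frame rate means a lower frame time. The sketch below, with a hypothetical helper name, converts the reported frame rates into frame times. The results only approximate the reported 2.95 and 2.59, which is expected if the report averages per-run values rather than taking the reciprocal of the mean.

```python
def frame_time_ms(frames_per_second: float) -> float:
    """Average time per frame in milliseconds for a given frame rate."""
    return 1000.0 / frames_per_second


base_fps, patched_fps = 337.58, 382.30  # values from the table above

base_ms = frame_time_ms(base_fps)
patched_ms = frame_time_ms(patched_fps)

print(f"base:    {base_fps:.2f} fps -> {base_ms:.2f} ms/frame")
print(f"patched: {patched_fps:.2f} fps -> {patched_ms:.2f} ms/frame")
print(f"frame time change: {(patched_ms - base_ms) / base_ms * 100.0:+.1f}%")
```

Under that assumption, the negative change on the milliseconds row goes together with the positive change on the frames_per_second row.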





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


    225995            +2.3%     231096        perf-stat.ps.iTLB-load-misses
    558694            +2.4%     572272        perf-stat.ps.iTLB-loads
     12320            +8.6%      13380        perf-stat.ps.minor-faults
    598059 ±  2%      +4.0%     621901 ±  2%  perf-stat.ps.node-loads
   1670891 ±  2%      +5.0%    1754927 ±  2%  perf-stat.ps.node-stores
     12339            +8.6%      13399        perf-stat.ps.page-faults
     14.86 ± 42%     -11.7        3.12 ±152%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     14.86 ± 42%     -11.7        3.12 ±152%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     14.86 ± 42%     -11.7        3.12 ±152%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
     14.86 ± 42%     -11.0        3.82 ±161%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
     11.53 ± 67%     -10.1        1.39 ±223%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
     11.53 ± 67%      -9.4        2.08 ±223%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
     11.53 ± 67%      -9.1        2.43 ±143%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      6.02 ± 95%      -5.3        0.70 ±223%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      5.51 ± 56%      -4.8        0.70 ±223%  perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.do_group_exit.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.calltrace.cycles-pp.get_signal.arch_do_signal_or_restart.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
     14.86 ± 42%     -11.7        3.12 ±152%  perf-profile.children.cycles-pp.start_secondary
     14.86 ± 42%     -11.0        3.82 ±161%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
     14.86 ± 42%     -11.0        3.82 ±161%  perf-profile.children.cycles-pp.cpu_startup_entry
     14.86 ± 42%     -11.0        3.82 ±161%  perf-profile.children.cycles-pp.do_idle
     11.53 ± 67%      -9.4        2.08 ±223%  perf-profile.children.cycles-pp.cpuidle_enter
     11.53 ± 67%      -9.4        2.08 ±223%  perf-profile.children.cycles-pp.cpuidle_enter_state
     11.53 ± 67%      -8.4        3.12 ±152%  perf-profile.children.cycles-pp.cpuidle_idle_call
      6.02 ± 95%      -5.3        0.70 ±223%  perf-profile.children.cycles-pp.intel_idle
      5.51 ± 56%      -4.8        0.70 ±223%  perf-profile.children.cycles-pp.intel_idle_ibrs
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.children.cycles-pp.arch_do_signal_or_restart
      7.02 ±128%      -2.6        4.44 ±147%  perf-profile.children.cycles-pp.get_signal
      6.02 ± 95%      -5.3        0.70 ±223%  perf-profile.self.cycles-pp.intel_idle
      5.51 ± 56%      -4.8        0.70 ±223%  perf-profile.self.cycles-pp.intel_idle_ibrs
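
The %change column in the table above appears to follow two conventions. For ordinary counters it is the relative change between the two means. For metrics that are already percentages (names ending in `%`, and the perf-profile cycles-pp rows) it looks like the absolute difference in percentage points. A minimal Python sketch, assuming that reading of the column; the function names are hypothetical and not part of lkp-tests:

```python
def relative_change_pct(base: float, patched: float) -> float:
    """Relative change of the patched mean against the base mean, in percent."""
    return (patched - base) / base * 100.0


def point_change(base: float, patched: float) -> float:
    """Absolute difference, for metrics that are already percentages."""
    return patched - base


# Means copied from the table above (base commit vs. patched commit).
print(f"{relative_change_pct(298.15, 288.49):+.1f}%")  # mipolys___sec
print(f"{relative_change_pct(535005, 580810):+.1f}%")  # minor_page_faults
print(f"{point_change(11.53, 2.43):+.1f}")             # cpuidle_idle_call calltrace
```

Because the table prints rounded means, a value recomputed this way can differ from the reported one in the last digit.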


***************************************************************************************************
lkp-cfl-d2: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/true/PutImage XY 500x500 Square/debian-x86_64-phoronix/lkp-cfl-d2/x11perf-1.1.1/phoronix-test-suite

commit: 
  16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
  54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")

16a9359401edcbc0 54fef7ea35dadd66193b98805b0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     11950 ±  3%     +93.9%      23173 ±  3%  meminfo.Unevictable
      1.02 ±  3%      -0.6        0.43 ±  3%  mpstat.cpu.all.iowait%
     21273            -5.3%      20141        vmstat.system.in
      4887 ± 32%     -67.7%       1579 ± 32%  phoronix-test-suite.time.involuntary_context_switches
    147212           +26.1%     185677        phoronix-test-suite.time.minor_page_faults
     76.83           +10.6%      85.00        phoronix-test-suite.time.percent_of_cpu_this_job_got
     96.48 ±  4%     +15.0%     110.93 ±  2%  phoronix-test-suite.time.user_time
    106.83           +12.8%     120.50        phoronix-test-suite.x11perf.PutImageXY500x500Square.operations___second
      3.73 ±  3%      -0.6        3.12 ±  2%  turbostat.C1E%
     58316 ±  5%     -20.6%      46326 ±  3%  turbostat.C3
      0.86 ±  2%      -0.1        0.74 ±  3%  turbostat.C3%
      1.24 ±  8%     -18.9%       1.01 ± 10%  turbostat.CPU%c3
     38.67 ±  2%      +5.0       43.70 ±  2%  turbostat.CPUGFX%
     23.44            +4.0%      24.38        turbostat.CorWatt
      2.30 ±  4%     -71.6%       0.65 ±  2%  turbostat.GFXWatt
     26.19            -2.7%      25.49        turbostat.PkgWatt
      1.41            +2.4%       1.44        turbostat.RAMWatt
     61.83 ± 23%     -20.2       41.62 ± 31%  perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     61.85 ± 23%     -20.2       41.64 ± 31%  perf-profile.children.cycles-pp.intel_idle_ibrs
      1.24 ± 59%      -1.0        0.23 ± 27%  perf-profile.children.cycles-pp.worker_thread
      0.84 ± 60%      -0.7        0.18 ± 48%  perf-profile.children.cycles-pp.asm_common_interrupt
      0.83 ± 61%      -0.7        0.18 ± 48%  perf-profile.children.cycles-pp.common_interrupt
      0.35 ± 74%      -0.3        0.06 ± 85%  perf-profile.children.cycles-pp.__common_interrupt
      0.35 ± 75%      -0.3        0.06 ± 85%  perf-profile.children.cycles-pp.handle_edge_irq
      0.30 ± 78%      -0.3        0.05 ± 81%  perf-profile.children.cycles-pp.handle_irq_event
      0.37 ± 18%      -0.2        0.18 ± 46%  perf-profile.children.cycles-pp.newidle_balance
      0.12 ± 32%      -0.1        0.04 ± 79%  perf-profile.children.cycles-pp.exit_mm
      0.12 ± 30%      -0.1        0.05 ± 73%  perf-profile.children.cycles-pp.native_apic_msr_eoi
      0.03 ±144%      +0.1        0.10 ± 38%  perf-profile.children.cycles-pp.dma_resv_iter_first_unlocked
      0.04 ±154%      +0.1        0.19 ± 40%  perf-profile.children.cycles-pp.i915_gem_busy_ioctl
     61.82 ± 23%     -20.2       41.63 ± 31%  perf-profile.self.cycles-pp.intel_idle_ibrs
      0.12 ± 26%      -0.1        0.05 ± 74%  perf-profile.self.cycles-pp.native_apic_msr_eoi
     31861            +5.9%      33748        proc-vmstat.nr_active_file
     29007            +7.6%      31200        proc-vmstat.nr_mapped
    183598            +1.6%     186520        proc-vmstat.nr_shmem
      2987 ±  3%     +93.9%       5793 ±  3%  proc-vmstat.nr_unevictable
     31861            +5.9%      33748        proc-vmstat.nr_zone_active_file
      2987 ±  3%     +93.9%       5793 ±  3%  proc-vmstat.nr_zone_unevictable
    721353 ±  2%     +13.3%     816958        proc-vmstat.numa_hit
    721406 ±  2%     +13.2%     816970        proc-vmstat.numa_local
     21301 ±  4%     +13.1%      24099        proc-vmstat.pgactivate
   2459642 ±  3%     +15.5%    2840123 ±  2%  proc-vmstat.pgalloc_normal
    530507 ±  3%      +8.0%     573060        proc-vmstat.pgfault
   2336660 ±  3%     +16.3%    2717873 ±  2%  proc-vmstat.pgfree
      9973 ±  2%    +143.6%      24292 ±  3%  proc-vmstat.unevictable_pgs_culled
      9875          +146.0%      24292 ±  3%  proc-vmstat.unevictable_pgs_rescued
      9876          +146.1%      24304 ±  3%  proc-vmstat.unevictable_pgs_scanned
 1.413e+09           +12.4%  1.588e+09        perf-stat.i.branch-instructions
      2.09            -0.1        1.99 ±  2%  perf-stat.i.branch-miss-rate%
  23305261            +1.8%   23715959        perf-stat.i.branch-misses
      4.57 ±  4%      +1.5        6.06 ±  2%  perf-stat.i.cache-miss-rate%
   4971151 ±  5%     +71.3%    8515308 ±  2%  perf-stat.i.cache-misses
 1.771e+08            +6.4%  1.884e+08        perf-stat.i.cache-references
      0.91            -3.5%       0.88 ±  2%  perf-stat.i.cpi
 4.936e+09            +7.8%  5.319e+09        perf-stat.i.cpu-cycles
     39.95 ±  3%     -15.1%      33.90 ±  2%  perf-stat.i.cpu-migrations
      1671 ±  4%     -34.5%       1095 ±  6%  perf-stat.i.cycles-between-cache-misses
 2.815e+09           +11.8%  3.147e+09        perf-stat.i.dTLB-loads
      0.02 ±  2%      -0.0        0.02 ±  4%  perf-stat.i.dTLB-store-miss-rate%
 6.478e+08           +10.4%  7.153e+08        perf-stat.i.dTLB-stores
     40.15 ±  2%      -2.9       37.29        perf-stat.i.iTLB-load-miss-rate%
 1.063e+10           +12.2%  1.192e+10        perf-stat.i.instructions
     40912 ±  4%     +25.5%      51361 ±  2%  perf-stat.i.instructions-per-iTLB-miss
      2.18            +2.7%       2.23        perf-stat.i.ipc
      0.41            +7.8%       0.44        perf-stat.i.metric.GHz
    421.03           +11.6%     469.88        perf-stat.i.metric.M/sec
      3358            +5.5%       3542        perf-stat.i.minor-faults
      0.00 ± 11%      -0.0        0.00 ±  6%  perf-stat.i.node-load-miss-rate%
      6.40 ±  8%     +17.6%       7.52 ±  5%  perf-stat.i.node-load-misses
    216221 ±  6%     +61.9%     350134        perf-stat.i.node-loads
      0.00 ± 16%      +0.1        0.12 ±220%  perf-stat.i.node-store-miss-rate%
      6.72 ±  6%  +6.9e+12%  4.615e+11 ±223%  perf-stat.i.node-store-misses
    694301 ±  9%     -14.3%     594881 ±  3%  perf-stat.i.node-stores
      3361            +5.5%       3545        perf-stat.i.page-faults
      0.47 ±  5%     +52.6%       0.71 ±  2%  perf-stat.overall.MPKI
      1.65            -0.2        1.49        perf-stat.overall.branch-miss-rate%
      2.81 ±  6%      +1.7        4.52 ±  2%  perf-stat.overall.cache-miss-rate%
      0.46            -4.0%       0.45        perf-stat.overall.cpi
    994.89 ±  5%     -37.2%     624.75 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.03            -0.0        0.02        perf-stat.overall.dTLB-load-miss-rate%
      0.01            -0.0        0.01        perf-stat.overall.dTLB-store-miss-rate%
     30.99 ±  3%      -2.3       28.68 ±  4%  perf-stat.overall.iTLB-load-miss-rate%
     25510 ± 11%     +20.2%      30672 ±  5%  perf-stat.overall.instructions-per-iTLB-miss
      2.15            +4.1%       2.24        perf-stat.overall.ipc
      0.00 ±  8%      -0.0        0.00 ±  5%  perf-stat.overall.node-load-miss-rate%
      0.00 ±  8%     +16.7       16.67 ±223%  perf-stat.overall.node-store-miss-rate%
 1.403e+09           +12.5%  1.578e+09        perf-stat.ps.branch-instructions
  23153579            +1.7%   23558574        perf-stat.ps.branch-misses
   4938806 ±  5%     +71.3%    8457880 ±  2%  perf-stat.ps.cache-misses
 1.758e+08            +6.4%  1.871e+08        perf-stat.ps.cache-references
 4.901e+09            +7.8%  5.282e+09        perf-stat.ps.cpu-cycles
     39.66 ±  3%     -15.1%      33.67 ±  2%  perf-stat.ps.cpu-migrations
 2.795e+09           +11.8%  3.125e+09        perf-stat.ps.dTLB-loads
 6.432e+08           +10.4%  7.103e+08        perf-stat.ps.dTLB-stores
 1.055e+10           +12.2%  1.184e+10        perf-stat.ps.instructions
      3336            +5.5%       3518        perf-stat.ps.minor-faults
      6.35 ±  8%     +17.6%       7.47 ±  5%  perf-stat.ps.node-load-misses
    214756 ±  6%     +61.9%     347761        perf-stat.ps.node-loads
      6.67 ±  6%  +6.9e+12%  4.577e+11 ±223%  perf-stat.ps.node-store-misses
    689455 ±  9%     -14.3%     590802 ±  3%  perf-stat.ps.node-stores
      3338            +5.5%       3521        perf-stat.ps.page-faults
 1.467e+12 ±  4%     +15.7%  1.698e+12 ±  2%  perf-stat.total.instructions
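
Several rows in these tables carry a %stddev of ±223%, sometimes next to an implausible %change such as the +6.9e+12% on node-store-misses. One arithmetic reading, offered as an assumption rather than documented lkp-tests behavior: if a metric is zero in every run but one, the population standard deviation divided by the mean equals sqrt(n - 1), which for six runs is about 2.236. The run values below are made up for illustration.

```python
import math
import statistics


def relative_stddev(samples: list[float]) -> float:
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(samples) / statistics.fmean(samples)


# Hypothetical six runs where only one run recorded a nonzero value.
runs = [0.0, 0.0, 0.0, 0.0, 0.0, 42.0]

ratio = relative_stddev(runs)
print(f"{ratio * 100:.1f}%")
print(math.isclose(ratio, math.sqrt(len(runs) - 1)))
```

If that reading holds, such rows are probably better treated as a single outlier run than as an effect of the patch.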



***************************************************************************************************
lkp-cfl-d2: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/true/2560 x 1440/debian-x86_64-phoronix/lkp-cfl-d2/openarena-1.5.5/phoronix-test-suite

commit: 
  16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
  54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")

16a9359401edcbc0 54fef7ea35dadd66193b98805b0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1459 ± 27%     +34.9%       1967 ± 13%  sched_debug.cpu.curr->pid.max
      1.63            -0.5        1.14 ±  2%  mpstat.cpu.all.iowait%
      1.17            -0.1        1.06        mpstat.cpu.all.irq%
    189.00 ± 11%     -53.6%      87.67 ± 17%  perf-c2c.DRAM.local
     20.67 ± 28%     -87.9%       2.50 ± 72%  perf-c2c.HITM.local
      6016            +5.7%       6358        vmstat.io.bi
     30168            -5.0%      28666        vmstat.system.cs
     27678            -2.7%      26933        vmstat.system.in
    337.58           +13.2%     382.30        phoronix-test-suite.openarena.2560x1440.frames_per_second
      2.95           -12.2%       2.59        phoronix-test-suite.openarena.2560x1440.milliseconds
    191369           +23.8%     236938        phoronix-test-suite.time.minor_page_faults
     51.00            +6.5%      54.33        phoronix-test-suite.time.percent_of_cpu_this_job_got
     56192           -63.1%      20758        phoronix-test-suite.time.voluntary_context_switches
     90640           -64.0%      32673 ±  7%  turbostat.C1
      0.68 ±  2%      -0.3        0.38 ±  5%  turbostat.C1%
     46385           -12.3%      40688 ±  2%  turbostat.C3
      1.72            -0.2        1.51 ±  2%  turbostat.C3%
      3.58 ±  6%     -17.6%       2.95 ± 11%  turbostat.CPU%c3
      5.71           -11.1%       5.08        turbostat.GFXWatt
     29983 ±  4%     -63.0%      11105 ± 24%  turbostat.POLL
      0.06 ±  8%      -0.0        0.01 ± 35%  turbostat.POLL%
      1.31 ± 15%     +70.2%       2.23 ±  7%  turbostat.Pkg%pc2
     29.56            -1.4%      29.14        turbostat.PkgWatt
      2123            -5.7%       2001        proc-vmstat.nr_active_anon
     37869            +2.8%      38912        proc-vmstat.nr_active_file
     22828            -5.8%      21501        proc-vmstat.nr_unevictable
      2123            -5.7%       2001        proc-vmstat.nr_zone_active_anon
     37869            +2.8%      38912        proc-vmstat.nr_zone_active_file
     22828            -5.8%      21501        proc-vmstat.nr_zone_unevictable
    624578            +4.2%     650937        proc-vmstat.numa_hit
    624578            +4.2%     650937        proc-vmstat.numa_local
    771624            +3.5%     798755        proc-vmstat.pgalloc_normal
    429393            +7.2%     460476        proc-vmstat.pgfault
    620059            +4.1%     645242        proc-vmstat.pgfree
      8.52 ±126%      -7.9        0.62 ±223%  perf-profile.calltrace.cycles-pp.wp_page_copy.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      6.09 ±114%      -5.4        0.67 ±223%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      3.76 ±110%      -3.8        0.00        perf-profile.calltrace.cycles-pp.error_entry
      8.52 ±126%      -7.9        0.62 ±223%  perf-profile.children.cycles-pp.wp_page_copy
      6.09 ±114%      -5.4        0.67 ±223%  perf-profile.children.cycles-pp.intel_idle
      5.79 ±100%      -4.6        1.24 ±223%  perf-profile.children.cycles-pp.do_filp_open
      5.79 ±100%      -4.6        1.24 ±223%  perf-profile.children.cycles-pp.path_openat
      3.76 ±110%      -3.8        0.00        perf-profile.children.cycles-pp.error_entry
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.hrtimer_interrupt
      6.09 ±114%      -5.4        0.67 ±223%  perf-profile.self.cycles-pp.intel_idle
      3.76 ±110%      -3.8        0.00        perf-profile.self.cycles-pp.error_entry
      4.97 ±109%      -2.2        2.78 ±223%  perf-profile.self.cycles-pp.zap_pte_range
  8.07e+08            +2.7%   8.29e+08        perf-stat.i.branch-instructions
  31612011            +4.7%   33095236        perf-stat.i.cache-misses
     31337            -4.0%      30085        perf-stat.i.context-switches
      1.57            +7.7%       1.69 ±  2%  perf-stat.i.cpi
 4.798e+09            +3.2%   4.95e+09        perf-stat.i.cpu-cycles
      0.14 ±  3%      +0.0        0.16 ±  3%  perf-stat.i.dTLB-load-miss-rate%
 1.131e+09            +4.1%  1.177e+09        perf-stat.i.dTLB-loads
      0.03 ±  2%      +0.0        0.03 ±  2%  perf-stat.i.dTLB-store-miss-rate%
    134590            +6.1%     142835        perf-stat.i.dTLB-store-misses
 5.907e+08            +4.5%  6.175e+08        perf-stat.i.dTLB-stores
   3858714            +6.5%    4109072        perf-stat.i.iTLB-loads
 4.646e+09            +3.6%  4.812e+09        perf-stat.i.instructions
      0.40            +3.2%       0.41        perf-stat.i.metric.GHz
    824.17            +6.4%     876.81        perf-stat.i.metric.K/sec
    224.04            +3.7%     232.30        perf-stat.i.metric.M/sec
      5577           +13.3%       6319        perf-stat.i.minor-faults
   1661010            +8.8%    1806501        perf-stat.i.node-loads
   4199874            +5.4%    4426929        perf-stat.i.node-stores
      5582           +13.3%       6325        perf-stat.i.page-faults
    151.84            -1.5%     149.61        perf-stat.overall.cycles-between-cache-misses
 7.953e+08            +2.6%   8.16e+08        perf-stat.ps.branch-instructions
  31131128            +4.6%   32563072        perf-stat.ps.cache-misses
     30863            -4.1%      29604        perf-stat.ps.context-switches
 4.727e+09            +3.1%  4.871e+09        perf-stat.ps.cpu-cycles
 1.114e+09            +4.0%  1.158e+09        perf-stat.ps.dTLB-loads
    132539            +6.0%     140536        perf-stat.ps.dTLB-store-misses
 5.818e+08            +4.4%  6.076e+08        perf-stat.ps.dTLB-stores
   3798756            +6.4%    4042216        perf-stat.ps.iTLB-loads
 4.577e+09            +3.5%  4.736e+09        perf-stat.ps.instructions
      5496           +13.2%       6220        perf-stat.ps.minor-faults
   1635550            +8.7%    1777301        perf-stat.ps.node-loads
   4135301            +5.3%    4355124        perf-stat.ps.node-stores
      5501           +13.2%       6226        perf-stat.ps.page-faults
 3.045e+11            -2.8%  2.961e+11        perf-stat.total.instructions
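
The two openarena result rows in the table above can be cross-checked against each other. Assuming `milliseconds` is the mean frame time (an assumption; the report does not define it), the frame rate is roughly 1000 divided by that value, so a lower figure there should accompany a higher frames_per_second. A short Python sketch using the means from the table:

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change of patched against base, in percent."""
    return (patched - base) / base * 100.0


# Means copied from the table above: (base commit, patched commit).
fps = (337.58, 382.30)
frame_ms = (2.95, 2.59)

print(f"fps change:        {pct_change(*fps):+.1f}%")
print(f"frame time change: {pct_change(*frame_ms):+.1f}%")

# Frame rate implied by the frame time, for comparison with the measured fps.
for measured, ms in zip(fps, frame_ms):
    print(f"measured {measured:.2f} fps, implied {1000.0 / ms:.2f} fps")
```

Under that assumption the implied rates land close to the measured ones, which would mean both rows describe the same change and the -12.2% is a reduction in frame time.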





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


***************************************************************************************************
lkp-cfl-d2: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/true/PutImage XY 500x500 Square/debian-x86_64-phoronix/lkp-cfl-d2/x11perf-1.1.1/phoronix-test-suite

commit: 
  16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
  54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")

16a9359401edcbc0 54fef7ea35dadd66193b98805b0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     11950 ±  3%     +93.9%      23173 ±  3%  meminfo.Unevictable
      1.02 ±  3%      -0.6        0.43 ±  3%  mpstat.cpu.all.iowait%
     21273            -5.3%      20141        vmstat.system.in
      4887 ± 32%     -67.7%       1579 ± 32%  phoronix-test-suite.time.involuntary_context_switches
    147212           +26.1%     185677        phoronix-test-suite.time.minor_page_faults
     76.83           +10.6%      85.00        phoronix-test-suite.time.percent_of_cpu_this_job_got
     96.48 ±  4%     +15.0%     110.93 ±  2%  phoronix-test-suite.time.user_time
    106.83           +12.8%     120.50        phoronix-test-suite.x11perf.PutImageXY500x500Square.operations___second
      3.73 ±  3%      -0.6        3.12 ±  2%  turbostat.C1E%
     58316 ±  5%     -20.6%      46326 ±  3%  turbostat.C3
      0.86 ±  2%      -0.1        0.74 ±  3%  turbostat.C3%
      1.24 ±  8%     -18.9%       1.01 ± 10%  turbostat.CPU%c3
     38.67 ±  2%      +5.0       43.70 ±  2%  turbostat.CPUGFX%
     23.44            +4.0%      24.38        turbostat.CorWatt
      2.30 ±  4%     -71.6%       0.65 ±  2%  turbostat.GFXWatt
     26.19            -2.7%      25.49        turbostat.PkgWatt
      1.41            +2.4%       1.44        turbostat.RAMWatt
     61.83 ± 23%     -20.2       41.62 ± 31%  perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     61.85 ± 23%     -20.2       41.64 ± 31%  perf-profile.children.cycles-pp.intel_idle_ibrs
      1.24 ± 59%      -1.0        0.23 ± 27%  perf-profile.children.cycles-pp.worker_thread
      0.84 ± 60%      -0.7        0.18 ± 48%  perf-profile.children.cycles-pp.asm_common_interrupt
      0.83 ± 61%      -0.7        0.18 ± 48%  perf-profile.children.cycles-pp.common_interrupt
      0.35 ± 74%      -0.3        0.06 ± 85%  perf-profile.children.cycles-pp.__common_interrupt
      0.35 ± 75%      -0.3        0.06 ± 85%  perf-profile.children.cycles-pp.handle_edge_irq
      0.30 ± 78%      -0.3        0.05 ± 81%  perf-profile.children.cycles-pp.handle_irq_event
      0.37 ± 18%      -0.2        0.18 ± 46%  perf-profile.children.cycles-pp.newidle_balance
      0.12 ± 32%      -0.1        0.04 ± 79%  perf-profile.children.cycles-pp.exit_mm
      0.12 ± 30%      -0.1        0.05 ± 73%  perf-profile.children.cycles-pp.native_apic_msr_eoi
      0.03 ±144%      +0.1        0.10 ± 38%  perf-profile.children.cycles-pp.dma_resv_iter_first_unlocked
      0.04 ±154%      +0.1        0.19 ± 40%  perf-profile.children.cycles-pp.i915_gem_busy_ioctl
     61.82 ± 23%     -20.2       41.63 ± 31%  perf-profile.self.cycles-pp.intel_idle_ibrs
      0.12 ± 26%      -0.1        0.05 ± 74%  perf-profile.self.cycles-pp.native_apic_msr_eoi
     31861            +5.9%      33748        proc-vmstat.nr_active_file
     29007            +7.6%      31200        proc-vmstat.nr_mapped
    183598            +1.6%     186520        proc-vmstat.nr_shmem
      2987 ±  3%     +93.9%       5793 ±  3%  proc-vmstat.nr_unevictable
     31861            +5.9%      33748        proc-vmstat.nr_zone_active_file
      2987 ±  3%     +93.9%       5793 ±  3%  proc-vmstat.nr_zone_unevictable
    721353 ±  2%     +13.3%     816958        proc-vmstat.numa_hit
    721406 ±  2%     +13.2%     816970        proc-vmstat.numa_local
     21301 ±  4%     +13.1%      24099        proc-vmstat.pgactivate
   2459642 ±  3%     +15.5%    2840123 ±  2%  proc-vmstat.pgalloc_normal
    530507 ±  3%      +8.0%     573060        proc-vmstat.pgfault
   2336660 ±  3%     +16.3%    2717873 ±  2%  proc-vmstat.pgfree
      9973 ±  2%    +143.6%      24292 ±  3%  proc-vmstat.unevictable_pgs_culled
      9875          +146.0%      24292 ±  3%  proc-vmstat.unevictable_pgs_rescued
      9876          +146.1%      24304 ±  3%  proc-vmstat.unevictable_pgs_scanned
 1.413e+09           +12.4%  1.588e+09        perf-stat.i.branch-instructions
      2.09            -0.1        1.99 ±  2%  perf-stat.i.branch-miss-rate%
  23305261            +1.8%   23715959        perf-stat.i.branch-misses
      4.57 ±  4%      +1.5        6.06 ±  2%  perf-stat.i.cache-miss-rate%
   4971151 ±  5%     +71.3%    8515308 ±  2%  perf-stat.i.cache-misses
 1.771e+08            +6.4%  1.884e+08        perf-stat.i.cache-references
      0.91            -3.5%       0.88 ±  2%  perf-stat.i.cpi
 4.936e+09            +7.8%  5.319e+09        perf-stat.i.cpu-cycles
     39.95 ±  3%     -15.1%      33.90 ±  2%  perf-stat.i.cpu-migrations
      1671 ±  4%     -34.5%       1095 ±  6%  perf-stat.i.cycles-between-cache-misses
 2.815e+09           +11.8%  3.147e+09        perf-stat.i.dTLB-loads
      0.02 ±  2%      -0.0        0.02 ±  4%  perf-stat.i.dTLB-store-miss-rate%
 6.478e+08           +10.4%  7.153e+08        perf-stat.i.dTLB-stores
     40.15 ±  2%      -2.9       37.29        perf-stat.i.iTLB-load-miss-rate%
 1.063e+10           +12.2%  1.192e+10        perf-stat.i.instructions
     40912 ±  4%     +25.5%      51361 ±  2%  perf-stat.i.instructions-per-iTLB-miss
      2.18            +2.7%       2.23        perf-stat.i.ipc
      0.41            +7.8%       0.44        perf-stat.i.metric.GHz
    421.03           +11.6%     469.88        perf-stat.i.metric.M/sec
      3358            +5.5%       3542        perf-stat.i.minor-faults
      0.00 ± 11%      -0.0        0.00 ±  6%  perf-stat.i.node-load-miss-rate%
      6.40 ±  8%     +17.6%       7.52 ±  5%  perf-stat.i.node-load-misses
    216221 ±  6%     +61.9%     350134        perf-stat.i.node-loads
      0.00 ± 16%      +0.1        0.12 ±220%  perf-stat.i.node-store-miss-rate%
      6.72 ±  6%  +6.9e+12%  4.615e+11 ±223%  perf-stat.i.node-store-misses
    694301 ±  9%     -14.3%     594881 ±  3%  perf-stat.i.node-stores
      3361            +5.5%       3545        perf-stat.i.page-faults
      0.47 ±  5%     +52.6%       0.71 ±  2%  perf-stat.overall.MPKI
      1.65            -0.2        1.49        perf-stat.overall.branch-miss-rate%
      2.81 ±  6%      +1.7        4.52 ±  2%  perf-stat.overall.cache-miss-rate%
      0.46            -4.0%       0.45        perf-stat.overall.cpi
    994.89 ±  5%     -37.2%     624.75 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.03            -0.0        0.02        perf-stat.overall.dTLB-load-miss-rate%
      0.01            -0.0        0.01        perf-stat.overall.dTLB-store-miss-rate%
     30.99 ±  3%      -2.3       28.68 ±  4%  perf-stat.overall.iTLB-load-miss-rate%
     25510 ± 11%     +20.2%      30672 ±  5%  perf-stat.overall.instructions-per-iTLB-miss
      2.15            +4.1%       2.24        perf-stat.overall.ipc
      0.00 ±  8%      -0.0        0.00 ±  5%  perf-stat.overall.node-load-miss-rate%
      0.00 ±  8%     +16.7       16.67 ±223%  perf-stat.overall.node-store-miss-rate%
 1.403e+09           +12.5%  1.578e+09        perf-stat.ps.branch-instructions
  23153579            +1.7%   23558574        perf-stat.ps.branch-misses
   4938806 ±  5%     +71.3%    8457880 ±  2%  perf-stat.ps.cache-misses
 1.758e+08            +6.4%  1.871e+08        perf-stat.ps.cache-references
 4.901e+09            +7.8%  5.282e+09        perf-stat.ps.cpu-cycles
     39.66 ±  3%     -15.1%      33.67 ±  2%  perf-stat.ps.cpu-migrations
 2.795e+09           +11.8%  3.125e+09        perf-stat.ps.dTLB-loads
 6.432e+08           +10.4%  7.103e+08        perf-stat.ps.dTLB-stores
 1.055e+10           +12.2%  1.184e+10        perf-stat.ps.instructions
      3336            +5.5%       3518        perf-stat.ps.minor-faults
      6.35 ±  8%     +17.6%       7.47 ±  5%  perf-stat.ps.node-load-misses
    214756 ±  6%     +61.9%     347761        perf-stat.ps.node-loads
      6.67 ±  6%  +6.9e+12%  4.577e+11 ±223%  perf-stat.ps.node-store-misses
    689455 ±  9%     -14.3%     590802 ±  3%  perf-stat.ps.node-stores
      3338            +5.5%       3521        perf-stat.ps.page-faults
 1.467e+12 ±  4%     +15.7%  1.698e+12 ±  2%  perf-stat.total.instructions
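
A note on reading the %change column above. This is a minimal sketch inferred from the reported values, not taken from the lkp-tests source: ordinary metrics appear to show the relative change between the two commit means, while metrics whose name ends in "%" appear to show the absolute difference in percentage points. The helper names pct_change and abs_delta are hypothetical.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change in percent between the base and patched means."""
    return round((patched - base) / base * 100.0, 1)


def abs_delta(base: float, patched: float) -> float:
    """Absolute difference, as seen for metrics already expressed in percent."""
    return round(patched - base, 1)


# x11perf PutImageXY500x500Square operations/second: 106.83 -> 120.50
print(pct_change(106.83, 120.50))
# mpstat.cpu.all.iowait%: 1.02 -> 0.43
print(abs_delta(1.02, 0.43))
```

Because the table prints rounded means, recomputing a change from the printed values can differ from the reported figure in the last digit.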



***************************************************************************************************
lkp-cfl-d2: 12 threads 1 sockets Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/true/2560 x 1440/debian-x86_64-phoronix/lkp-cfl-d2/openarena-1.5.5/phoronix-test-suite

commit: 
  16a9359401 ("drm/i915: Implement transcoder LRR for TGL+")
  54fef7ea35 ("drm/i915/gem: Allow users to disable waitboost")

16a9359401edcbc0 54fef7ea35dadd66193b98805b0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1459 ± 27%     +34.9%       1967 ± 13%  sched_debug.cpu.curr->pid.max
      1.63            -0.5        1.14 ±  2%  mpstat.cpu.all.iowait%
      1.17            -0.1        1.06        mpstat.cpu.all.irq%
    189.00 ± 11%     -53.6%      87.67 ± 17%  perf-c2c.DRAM.local
     20.67 ± 28%     -87.9%       2.50 ± 72%  perf-c2c.HITM.local
      6016            +5.7%       6358        vmstat.io.bi
     30168            -5.0%      28666        vmstat.system.cs
     27678            -2.7%      26933        vmstat.system.in
    337.58           +13.2%     382.30        phoronix-test-suite.openarena.2560x1440.frames_per_second
      2.95           -12.2%       2.59        phoronix-test-suite.openarena.2560x1440.milliseconds
    191369           +23.8%     236938        phoronix-test-suite.time.minor_page_faults
     51.00            +6.5%      54.33        phoronix-test-suite.time.percent_of_cpu_this_job_got
     56192           -63.1%      20758        phoronix-test-suite.time.voluntary_context_switches
     90640           -64.0%      32673 ±  7%  turbostat.C1
      0.68 ±  2%      -0.3        0.38 ±  5%  turbostat.C1%
     46385           -12.3%      40688 ±  2%  turbostat.C3
      1.72            -0.2        1.51 ±  2%  turbostat.C3%
      3.58 ±  6%     -17.6%       2.95 ± 11%  turbostat.CPU%c3
      5.71           -11.1%       5.08        turbostat.GFXWatt
     29983 ±  4%     -63.0%      11105 ± 24%  turbostat.POLL
      0.06 ±  8%      -0.0        0.01 ± 35%  turbostat.POLL%
      1.31 ± 15%     +70.2%       2.23 ±  7%  turbostat.Pkg%pc2
     29.56            -1.4%      29.14        turbostat.PkgWatt
      2123            -5.7%       2001        proc-vmstat.nr_active_anon
     37869            +2.8%      38912        proc-vmstat.nr_active_file
     22828            -5.8%      21501        proc-vmstat.nr_unevictable
      2123            -5.7%       2001        proc-vmstat.nr_zone_active_anon
     37869            +2.8%      38912        proc-vmstat.nr_zone_active_file
     22828            -5.8%      21501        proc-vmstat.nr_zone_unevictable
    624578            +4.2%     650937        proc-vmstat.numa_hit
    624578            +4.2%     650937        proc-vmstat.numa_local
    771624            +3.5%     798755        proc-vmstat.pgalloc_normal
    429393            +7.2%     460476        proc-vmstat.pgfault
    620059            +4.1%     645242        proc-vmstat.pgfree
      8.52 ±126%      -7.9        0.62 ±223%  perf-profile.calltrace.cycles-pp.wp_page_copy.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      6.09 ±114%      -5.4        0.67 ±223%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      3.76 ±110%      -3.8        0.00        perf-profile.calltrace.cycles-pp.error_entry
      8.52 ±126%      -7.9        0.62 ±223%  perf-profile.children.cycles-pp.wp_page_copy
      6.09 ±114%      -5.4        0.67 ±223%  perf-profile.children.cycles-pp.intel_idle
      5.79 ±100%      -4.6        1.24 ±223%  perf-profile.children.cycles-pp.do_filp_open
      5.79 ±100%      -4.6        1.24 ±223%  perf-profile.children.cycles-pp.path_openat
      3.76 ±110%      -3.8        0.00        perf-profile.children.cycles-pp.error_entry
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      3.76 ±110%      -2.6        1.11 ±223%  perf-profile.children.cycles-pp.hrtimer_interrupt
      6.09 ±114%      -5.4        0.67 ±223%  perf-profile.self.cycles-pp.intel_idle
      3.76 ±110%      -3.8        0.00        perf-profile.self.cycles-pp.error_entry
      4.97 ±109%      -2.2        2.78 ±223%  perf-profile.self.cycles-pp.zap_pte_range
  8.07e+08            +2.7%   8.29e+08        perf-stat.i.branch-instructions
  31612011            +4.7%   33095236        perf-stat.i.cache-misses
     31337            -4.0%      30085        perf-stat.i.context-switches
      1.57            +7.7%       1.69 ±  2%  perf-stat.i.cpi
 4.798e+09            +3.2%   4.95e+09        perf-stat.i.cpu-cycles
      0.14 ±  3%      +0.0        0.16 ±  3%  perf-stat.i.dTLB-load-miss-rate%
 1.131e+09            +4.1%  1.177e+09        perf-stat.i.dTLB-loads
      0.03 ±  2%      +0.0        0.03 ±  2%  perf-stat.i.dTLB-store-miss-rate%
    134590            +6.1%     142835        perf-stat.i.dTLB-store-misses
 5.907e+08            +4.5%  6.175e+08        perf-stat.i.dTLB-stores
   3858714            +6.5%    4109072        perf-stat.i.iTLB-loads
 4.646e+09            +3.6%  4.812e+09        perf-stat.i.instructions
      0.40            +3.2%       0.41        perf-stat.i.metric.GHz
    824.17            +6.4%     876.81        perf-stat.i.metric.K/sec
    224.04            +3.7%     232.30        perf-stat.i.metric.M/sec
      5577           +13.3%       6319        perf-stat.i.minor-faults
   1661010            +8.8%    1806501        perf-stat.i.node-loads
   4199874            +5.4%    4426929        perf-stat.i.node-stores
      5582           +13.3%       6325        perf-stat.i.page-faults
    151.84            -1.5%     149.61        perf-stat.overall.cycles-between-cache-misses
 7.953e+08            +2.6%   8.16e+08        perf-stat.ps.branch-instructions
  31131128            +4.6%   32563072        perf-stat.ps.cache-misses
     30863            -4.1%      29604        perf-stat.ps.context-switches
 4.727e+09            +3.1%  4.871e+09        perf-stat.ps.cpu-cycles
 1.114e+09            +4.0%  1.158e+09        perf-stat.ps.dTLB-loads
    132539            +6.0%     140536        perf-stat.ps.dTLB-store-misses
 5.818e+08            +4.4%  6.076e+08        perf-stat.ps.dTLB-stores
   3798756            +6.4%    4042216        perf-stat.ps.iTLB-loads
 4.577e+09            +3.5%  4.736e+09        perf-stat.ps.instructions
      5496           +13.2%       6220        perf-stat.ps.minor-faults
   1635550            +8.7%    1777301        perf-stat.ps.node-loads
   4135301            +5.3%    4355124        perf-stat.ps.node-stores
      5501           +13.2%       6226        perf-stat.ps.page-faults
 3.045e+11            -2.8%  2.961e+11        perf-stat.total.instructions





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

