* [linus:master] [futex]  cec199c5e3:  will-it-scale.per_process_ops 4.7% regression
From: kernel test robot @ 2025-06-03 14:16 UTC
  To: Peter Zijlstra
  Cc: oe-lkp, lkp, linux-kernel, Sebastian Andrzej Siewior, oliver.sang



Hello,

kernel test robot noticed a 4.7% regression of will-it-scale.per_process_ops on:


commit: cec199c5e39bde7191a08087cc3d002ccfab31ff ("futex: Implement FUTEX2_NUMA")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still a regression on linus/master      0f70f5b08a47a3bc1a252e5f451a137cde7c98ce]
[still a regression on linux-next/master 3a83b350b5be4b4f6bd895eecf9a92080200ee5d]

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 8 threads 1 socket Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Haswell) with 8G memory
parameters:

	nr_task: 100%
	mode: process
	test: futex3
	cpufreq_governor: performance


In addition, the commit has a significant impact on the following tests:

+------------------+---------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops  3.4% regression                    |
| test machine     | 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory |
| test parameters  | cpufreq_governor=performance                                                    |
|                  | mode=thread                                                                     |
|                  | nr_task=100%                                                                    |
|                  | test=futex2                                                                     |
+------------------+---------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops  4.6% regression                   |
| test machine     | 384 threads 2 sockets Intel(R) Xeon(R) 6972P (Granite Rapids) with 128G memory  |
| test parameters  | cpufreq_governor=performance                                                    |
|                  | mode=process                                                                    |
|                  | nr_task=100%                                                                    |
|                  | test=futex4                                                                     |
+------------------+---------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops  3.2% regression                    |
| test machine     | 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory |
| test parameters  | cpufreq_governor=performance                                                    |
|                  | mode=thread                                                                     |
|                  | nr_task=100%                                                                    |
|                  | test=futex1                                                                     |
+------------------+---------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202506032136.fb9b9db5-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250603/202506032136.fb9b9db5-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-hsw-d01/futex3/will-it-scale

commit: 
  63e8595c06 ("futex: Allow to make the private hash immutable")
  cec199c5e3 ("futex: Implement FUTEX2_NUMA")

63e8595c060a1fef cec199c5e39bde7191a08087cc3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    661671 ± 12%     -34.6%     432677 ± 43%  sched_debug.cpu.avg_idle.min
      0.03 ± 17%     -29.5%       0.02 ± 16%  perf-sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
    433.76 ± 12%     +47.9%     641.70 ± 18%  perf-sched.wait_and_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
    433.58 ± 12%     +47.9%     641.30 ± 18%  perf-sched.wait_time.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
  12631979            -4.7%   12040371        will-it-scale.8.processes
   1578997            -4.7%    1505046        will-it-scale.per_process_ops
  12631979            -4.7%   12040371        will-it-scale.workload
 1.151e+09            +2.2%  1.177e+09        perf-stat.i.branch-instructions
      4.16            -2.7%       4.05        perf-stat.i.cpi
 7.188e+09            +2.7%  7.379e+09        perf-stat.i.instructions
      0.25            +2.6%       0.25        perf-stat.i.ipc
      0.78            -0.0        0.77        perf-stat.overall.branch-miss-rate%
      4.10            -2.6%       3.99        perf-stat.overall.cpi
      0.24            +2.7%       0.25        perf-stat.overall.ipc
    171339            +7.6%     184338        perf-stat.overall.path-length
 1.148e+09            +2.2%  1.173e+09        perf-stat.ps.branch-instructions
 7.164e+09            +2.7%  7.355e+09        perf-stat.ps.instructions
 2.164e+12            +2.5%   2.22e+12        perf-stat.total.instructions
     11.89 ±  7%      -3.3        8.57 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
     34.29            -3.3       30.97        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
     13.49            -0.6       12.90        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
     14.07            -0.5       13.59        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
      2.24            -0.1        2.14        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      0.35 ± 70%      +0.2        0.57        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.33 ±  4%      +1.6        2.90        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      6.39 ±  4%      +2.7        9.09        perf-profile.calltrace.cycles-pp.futex_hash.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
     13.20 ±  2%      +3.5       16.68        perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     14.12 ±  2%      +4.1       18.24        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     15.54 ±  2%      +4.5       20.00        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     20.06            +7.0       27.09        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      0.00            +7.5        7.46        perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wake.do_futex.__x64_sys_futex
     21.50            +7.7       29.17        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     37.35            -3.4       33.90        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      8.68 ±  7%      -2.2        6.46 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
     14.75            -1.6       13.12        perf-profile.children.cycles-pp.entry_SYSCALL_64
     14.34            -0.5       13.87        perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.53 ±  4%      -0.1        0.47 ±  3%  perf-profile.children.cycles-pp.futex_hash_put
      0.58            +0.1        0.65 ±  2%  perf-profile.children.cycles-pp.x64_sys_call
      1.35 ±  4%      +1.6        2.94        perf-profile.children.cycles-pp.get_futex_key
      6.40 ±  4%      +2.7        9.14        perf-profile.children.cycles-pp.futex_hash
     13.50 ±  2%      +3.5       17.00        perf-profile.children.cycles-pp.futex_wake
     14.28 ±  2%      +4.1       18.38        perf-profile.children.cycles-pp.do_futex
     15.61 ±  2%      +4.5       20.12        perf-profile.children.cycles-pp.__x64_sys_futex
     20.33            +7.0       27.36        perf-profile.children.cycles-pp.do_syscall_64
      0.00            +7.5        7.46        perf-profile.children.cycles-pp.__futex_hash
     21.75            +7.7       29.41        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      6.26 ±  4%      -4.7        1.54        perf-profile.self.cycles-pp.futex_hash
     37.08            -3.4       33.63        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
     12.73            -1.6       11.17        perf-profile.self.cycles-pp.entry_SYSCALL_64
      5.40 ±  7%      -1.1        4.30 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      7.82            -1.1        6.76        perf-profile.self.cycles-pp.syscall
      5.31 ±  3%      -0.7        4.59        perf-profile.self.cycles-pp.futex_wake
     14.34            -0.5       13.86        perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.89            -0.1        1.80        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.51 ±  4%      -0.1        0.46 ±  3%  perf-profile.self.cycles-pp.futex_hash_put
      0.49            +0.1        0.58        perf-profile.self.cycles-pp.x64_sys_call
      1.30 ±  2%      +0.4        1.75 ±  2%  perf-profile.self.cycles-pp.__x64_sys_futex
      0.85 ±  2%      +0.6        1.40 ±  2%  perf-profile.self.cycles-pp.do_futex
      1.47 ±  2%      +0.6        2.11 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.29 ±  3%      +1.6        2.88        perf-profile.self.cycles-pp.get_futex_key
      1.74            +2.6        4.31 ±  2%  perf-profile.self.cycles-pp.do_syscall_64
      0.00            +7.4        7.40        perf-profile.self.cycles-pp.__futex_hash
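
As a reading aid for the comparison table above: the %change column appears to
be a relative change for count-like metrics and an absolute difference in
percentage points for metrics that are already percentages (the perf-profile
rows). perf-stat.overall.path-length looks consistent with total instructions
divided by workload operations; that definition is an assumption inferred from
the numbers, not something this report states. A minimal Python sketch with
hypothetical helper names:

```python
# Hypothetical helpers for re-deriving columns of the comparison table.
# The path-length definition is an assumption inferred from the figures.

def relative_change_pct(base, new):
    """Relative change in percent, as used for count-like metrics."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference, as used for rows that are already percentages."""
    return new - base


def path_length(total_instructions, workload_ops):
    """Assumed definition: instructions retired per workload operation."""
    return total_instructions / workload_ops


# will-it-scale.per_process_ops, futex3 on lkp-hsw-d01
ops_change = relative_change_pct(1578997, 1505046)

# perf-profile.calltrace ... entry_SYSCALL_64_safe_stack.syscall
safe_stack_change = point_change(11.89, 8.57)

# perf-stat.total.instructions / will-it-scale.workload, before and after
pl_base = path_length(2.164e12, 12631979)
pl_new = path_length(2.22e12, 12040371)

print(f"per_process_ops change: {ops_change:+.1f}%")
print(f"safe_stack change:      {safe_stack_change:+.1f} points")
print(f"path length:            {pl_base:.0f} -> {pl_new:.0f}")
```

With the figures from the table this gives roughly -4.7% and -3.3 points. The
computed path lengths land within about 0.1% of the reported 171339 and 184338;
the small gap is expected because the instruction totals are printed with only
a few significant digits.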


***************************************************************************************************
lkp-srf-2sp1: 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp1/futex2/will-it-scale

commit: 
  63e8595c06 ("futex: Allow to make the private hash immutable")
  cec199c5e3 ("futex: Implement FUTEX2_NUMA")

63e8595c060a1fef cec199c5e39bde7191a08087cc3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     41.17 ±  8%     -21.1%      32.50 ± 15%  perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     22531 ±  3%     +13.6%      25603 ± 18%  proc-vmstat.numa_pages_migrated
     95467 ± 22%     +48.6%     141903 ± 19%  proc-vmstat.numa_pte_updates
     22531 ±  3%     +13.6%      25603 ± 18%  proc-vmstat.pgmigrate_success
 9.212e+08            -3.4%  8.901e+08        will-it-scale.256.threads
   3598280            -3.4%    3477060        will-it-scale.per_thread_ops
 9.212e+08            -3.4%  8.901e+08        will-it-scale.workload
   7870684 ± 34%    +203.6%   23897639 ± 30%  perf-stat.i.branch-misses
     41.81 ± 55%     -20.4       21.40 ± 90%  perf-stat.i.cache-miss-rate%
      0.00 ± 34%      +0.0        0.01 ± 30%  perf-stat.overall.branch-miss-rate%
     33.72 ± 44%     -15.4       18.36 ± 69%  perf-stat.overall.cache-miss-rate%
    364549            +3.4%     376857        perf-stat.overall.path-length
   7813564 ± 34%    +204.1%   23764749 ± 30%  perf-stat.ps.branch-misses
     21.56            -2.4       19.17        perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wait_setup.__futex_wait.futex_wait
     20.55            -2.3       18.23        perf-profile.calltrace.cycles-pp.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wait_setup.__futex_wait
     18.89            -1.6       17.27        perf-profile.calltrace.cycles-pp.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wait_setup
     16.79            -1.2       15.55        perf-profile.calltrace.cycles-pp.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key
     23.20            -0.8       22.41        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
     12.14            -0.6       11.58        perf-profile.calltrace.cycles-pp.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast
      2.72 ±  2%      -0.2        2.56        perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      4.52            -0.1        4.37        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      2.91 ±  2%      -0.1        2.76 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      3.20            -0.1        3.10        perf-profile.calltrace.cycles-pp.try_get_folio.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      0.88            -0.1        0.78        perf-profile.calltrace.cycles-pp.___pte_offset_map.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      2.98            -0.1        2.88        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      2.58            -0.1        2.50        perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      1.80            -0.1        1.74        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
     26.00            +0.4       26.36        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
     65.92            +1.1       67.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     64.16            +1.1       65.30        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     57.75            +1.4       59.10        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     56.92            +1.4       58.29        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     56.20            +1.4       57.60        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     54.70            +1.5       56.16        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     46.65            +1.6       48.28        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      1.72 ±  7%      +1.8        3.55 ±  3%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00            +3.0        3.01        perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wait_setup.__futex_wait.futex_wait
     21.79            -2.6       19.24        perf-profile.children.cycles-pp.get_user_pages_fast
     20.62            -2.3       18.30        perf-profile.children.cycles-pp.gup_fast_fallback
     18.96            -1.6       17.34        perf-profile.children.cycles-pp.gup_fast
     16.82            -1.2       15.58        perf-profile.children.cycles-pp.gup_fast_pgd_range
     12.21            -0.5       11.66        perf-profile.children.cycles-pp.gup_fast_pte_range
     15.66            -0.5       15.13        perf-profile.children.cycles-pp.entry_SYSCALL_64
     12.01            -0.4       11.62        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      2.76 ±  2%      -0.2        2.60        perf-profile.children.cycles-pp.futex_q_unlock
      4.59            -0.2        4.43        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      2.95 ±  2%      -0.1        2.80 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock
      0.59            -0.1        0.47        perf-profile.children.cycles-pp.is_valid_gup_args
      0.91            -0.1        0.80        perf-profile.children.cycles-pp.___pte_offset_map
      3.24            -0.1        3.13        perf-profile.children.cycles-pp.try_get_folio
      2.62            -0.1        2.53        perf-profile.children.cycles-pp.futex_q_lock
      2.03            -0.1        1.96        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.41            -0.0        0.39        perf-profile.children.cycles-pp.try_grab_folio_fast
      0.50            -0.0        0.49        perf-profile.children.cycles-pp.testcase
      0.49            -0.0        0.47        perf-profile.children.cycles-pp.x64_sys_call
      0.49            -0.0        0.47        perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.15            -0.0        0.14        perf-profile.children.cycles-pp.futex_setup_timer
      0.09            +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.__sysvec_thermal
      0.09            +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.intel_thermal_interrupt
     26.06            +0.4       26.43        perf-profile.children.cycles-pp.get_futex_key
     66.09            +1.1       67.16        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     64.36            +1.1       65.48        perf-profile.children.cycles-pp.do_syscall_64
     57.75            +1.4       59.10        perf-profile.children.cycles-pp.__x64_sys_futex
     56.92            +1.4       58.29        perf-profile.children.cycles-pp.do_futex
     56.22            +1.4       57.62        perf-profile.children.cycles-pp.futex_wait
     55.22            +1.4       56.67        perf-profile.children.cycles-pp.__futex_wait
     46.26            +1.7       48.00        perf-profile.children.cycles-pp.futex_wait_setup
      1.75 ±  7%      +1.8        3.57 ±  3%  perf-profile.children.cycles-pp.futex_hash
      0.00            +3.0        3.04        perf-profile.children.cycles-pp.__futex_hash
      1.71 ±  7%      -1.2        0.50 ± 23%  perf-profile.self.cycles-pp.futex_hash
      4.58            -0.7        3.88 ±  2%  perf-profile.self.cycles-pp.gup_fast_pgd_range
      1.49            -0.5        0.98 ±  2%  perf-profile.self.cycles-pp.gup_fast_fallback
     11.97            -0.4       11.59        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      2.07            -0.4        1.69        perf-profile.self.cycles-pp.gup_fast
     11.78            -0.3       11.46        perf-profile.self.cycles-pp.syscall
      9.43            -0.3        9.12        perf-profile.self.cycles-pp.__futex_wait
      0.76            -0.3        0.47        perf-profile.self.cycles-pp.get_user_pages_fast
      7.34            -0.3        7.08        perf-profile.self.cycles-pp.gup_fast_pte_range
      7.64            -0.3        7.38        perf-profile.self.cycles-pp.entry_SYSCALL_64
      2.72 ±  2%      -0.2        2.56        perf-profile.self.cycles-pp.futex_q_unlock
      0.58            -0.1        0.44        perf-profile.self.cycles-pp.is_valid_gup_args
      4.20            -0.1        4.06        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.88            -0.1        0.76        perf-profile.self.cycles-pp.___pte_offset_map
      3.23            -0.1        3.11        perf-profile.self.cycles-pp.try_get_folio
      2.61            -0.1        2.52        perf-profile.self.cycles-pp.futex_q_lock
      1.53            -0.1        1.48        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.54            -0.1        1.49        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.39            -0.0        1.35        perf-profile.self.cycles-pp.do_syscall_64
      0.41            -0.0        0.37        perf-profile.self.cycles-pp.try_grab_folio_fast
      0.86            -0.0        0.83        perf-profile.self.cycles-pp.futex_wait
      0.83            -0.0        0.80        perf-profile.self.cycles-pp.__x64_sys_futex
      0.70            -0.0        0.67        perf-profile.self.cycles-pp.do_futex
      0.45            -0.0        0.44        perf-profile.self.cycles-pp.x64_sys_call
      0.47            -0.0        0.45        perf-profile.self.cycles-pp.testcase
      0.45            -0.0        0.44        perf-profile.self.cycles-pp.syscall_return_via_sysret
      4.24            +2.9        7.17        perf-profile.self.cycles-pp.get_futex_key
      0.00            +3.0        3.03        perf-profile.self.cycles-pp.__futex_hash
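
For anyone post-processing these tables, the rows follow a regular layout:
base value, optional "± N%" stddev, change (with a trailing % when relative),
new value, optional stddev, metric name. The following Python sketch parses one
row; the regular expression and the parse_row name are illustrative assumptions
based on the rows in this report, not part of the lkp tooling.

```python
import re

# Pattern inferred from the rows in this report (an assumption, not an
# official format description).
ROW = re.compile(
    r"""
    ^\s*(?P<base>[\d.]+(?:e[+-]?\d+)?)      # base value, e.g. 661671 or 9.212e+08
    (?:\s*±\s*(?P<base_sd>\d+)%)?           # optional stddev of the base value
    \s+(?P<change>[+-][\d.]+)(?P<pct>%)?    # change, relative if followed by %
    \s+(?P<new>[\d.]+(?:e[+-]?\d+)?)        # value with the commit applied
    (?:\s*±\s*(?P<new_sd>\d+)%)?            # optional stddev of the new value
    \s+(?P<metric>\S+)\s*$                  # metric name
    """,
    re.VERBOSE,
)


def parse_row(line):
    """Return a dict for a table row, or None if the line is not a row."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "base": float(m["base"]),
        "base_stddev_pct": int(m["base_sd"]) if m["base_sd"] else None,
        "change": float(m["change"]),
        "relative": m["pct"] is not None,
        "new": float(m["new"]),
        "new_stddev_pct": int(m["new_sd"]) if m["new_sd"] else None,
        "metric": m["metric"],
    }


row = parse_row(
    "    661671 ± 12%     -34.6%     432677 ± 43%  sched_debug.cpu.avg_idle.min"
)
print(row)
```

Header and separator lines (for example "commit: " or the row of dashes) do not
match and yield None, so the function can be mapped over the whole message.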



***************************************************************************************************
lkp-gnr-2ap2: 384 threads 2 sockets Intel(R) Xeon(R) 6972P (Granite Rapids) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-gnr-2ap2/futex4/will-it-scale

commit: 
  63e8595c06 ("futex: Allow to make the private hash immutable")
  cec199c5e3 ("futex: Implement FUTEX2_NUMA")

63e8595c060a1fef cec199c5e39bde7191a08087cc3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.72 ±  4%      +0.1        0.82 ±  6%  mpstat.cpu.all.irq%
   2699383           -13.4%    2337183 ±  6%  numa-meminfo.node1.Shmem
    123578 ±102%    +116.6%     267679 ± 25%  numa-numastat.node1.other_node
    123578 ±102%    +116.6%     267679 ± 25%  numa-vmstat.node1.numa_other
 2.323e+09            -4.6%  2.216e+09        will-it-scale.384.processes
   6049881            -4.6%    5771879        will-it-scale.per_process_ops
 2.323e+09            -4.6%  2.216e+09        will-it-scale.workload
      2.14 ± 53%     -69.0%       0.66 ±149%  perf-sched.sch_delay.avg.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
      2.14 ± 53%     -69.0%       0.66 ±149%  perf-sched.sch_delay.max.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
    351.13 ±130%    +836.1%       3286 ± 49%  perf-sched.wait_and_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      2.14 ± 53%     -68.9%       0.67 ±148%  perf-sched.wait_time.avg.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
      2.14 ± 53%     -68.9%       0.67 ±148%  perf-sched.wait_time.max.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
    342.14 ±135%    +860.6%       3286 ± 49%  perf-sched.wait_time.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
 2.319e+11            +1.8%  2.361e+11        perf-stat.i.branch-instructions
      0.92 ±  4%      -4.3%       0.88        perf-stat.i.cpi
 1.473e+12            +2.2%  1.506e+12        perf-stat.i.instructions
      1.11            +2.2%       1.14        perf-stat.i.ipc
      0.89            -1.6%       0.88        perf-stat.overall.cpi
      1.12            +1.6%       1.14        perf-stat.overall.ipc
    193483            +6.3%     205704        perf-stat.overall.path-length
  2.31e+11            +1.8%  2.353e+11        perf-stat.ps.branch-instructions
 1.468e+12            +2.3%  1.501e+12        perf-stat.ps.instructions
 4.495e+14            +1.4%  4.559e+14        perf-stat.total.instructions
      3.43 ±  2%      -0.6        2.82 ±  2%  perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
     28.24            -0.6       27.62        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
      9.46            -0.5        8.95        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      3.36            -0.3        3.10        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      1.76            -0.1        1.68 ±  2%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
      1.31            -0.1        1.23        perf-profile.calltrace.cycles-pp.testcase
      1.13            -0.1        1.06        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      2.09            -0.0        2.04        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.64            -0.0        1.61        perf-profile.calltrace.cycles-pp.futex_hash_put.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.72            -0.0        0.70        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      2.16            +0.5        2.63        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
      2.92            +0.8        3.74        perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
     68.21            +1.1       69.32        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     65.61            +1.1       66.75        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     49.34            +1.8       51.14        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     45.75            +2.0       47.78        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     43.36            +2.1       45.44        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.80 ±  2%      +2.1        5.94 ±  2%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wait_setup.__futex_wait.futex_wait.do_futex
     38.92            +2.3       41.20        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     29.49            +2.6       32.08        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      0.00            +4.5        4.50        perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wait_setup.__futex_wait.futex_wait
      3.59 ±  2%      -0.6        2.99        perf-profile.children.cycles-pp.futex_q_unlock
     10.00            -0.6        9.43        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
     16.65            -0.5       16.17        perf-profile.children.cycles-pp.entry_SYSCALL_64
      6.34            -0.3        6.08        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      2.41            -0.2        2.23        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      1.44            -0.1        1.33        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      1.98            -0.1        1.89        perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.51            -0.1        1.43        perf-profile.children.cycles-pp.testcase
      1.41            -0.1        1.36        perf-profile.children.cycles-pp.futex_hash_put
      2.34            -0.1        2.28        perf-profile.children.cycles-pp.x64_sys_call
      0.79            -0.0        0.74        perf-profile.children.cycles-pp.futex_setup_timer
      0.10 ±  3%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
      0.16 ±  3%      +0.0        0.17 ±  2%  perf-profile.children.cycles-pp.sched_tick
      0.14 ±  9%      +0.1        0.20 ± 16%  perf-profile.children.cycles-pp.ktime_get
      0.14 ± 11%      +0.1        0.20 ± 14%  perf-profile.children.cycles-pp.clockevents_program_event
      0.34 ±  9%      +0.1        0.42 ± 12%  perf-profile.children.cycles-pp.update_process_times
      0.42 ±  8%      +0.1        0.52 ± 12%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.42 ±  9%      +0.1        0.51 ± 13%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.70 ±  7%      +0.2        0.86 ± 10%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.74 ±  6%      +0.2        0.90 ± 10%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.68 ±  7%      +0.2        0.84 ± 10%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.68 ±  7%      +0.2        0.84 ± 10%  perf-profile.children.cycles-pp.hrtimer_interrupt
      2.34            +0.4        2.78        perf-profile.children.cycles-pp.get_futex_key
      3.07            +0.8        3.91        perf-profile.children.cycles-pp.futex_q_lock
     66.52            +1.1       67.64        perf-profile.children.cycles-pp.do_syscall_64
     68.64            +1.1       69.76        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     49.87            +1.8       51.64        perf-profile.children.cycles-pp.__x64_sys_futex
     46.54            +2.0       48.55        perf-profile.children.cycles-pp.do_futex
     43.64            +2.1       45.70        perf-profile.children.cycles-pp.futex_wait
     39.38            +2.3       41.65        perf-profile.children.cycles-pp.__futex_wait
      3.95 ±  2%      +2.4        6.32 ±  2%  perf-profile.children.cycles-pp.futex_hash
     31.16            +2.6       33.80        perf-profile.children.cycles-pp.futex_wait_setup
      0.00            +4.7        4.74        perf-profile.children.cycles-pp.__futex_hash
      3.76 ±  2%      -2.2        1.56 ±  9%  perf-profile.self.cycles-pp.futex_hash
      3.36 ±  2%      -0.6        2.80 ±  2%  perf-profile.self.cycles-pp.futex_q_unlock
      8.55            -0.5        8.08        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      8.11            -0.4        7.75        perf-profile.self.cycles-pp.__futex_wait
     15.92            -0.4       15.56        perf-profile.self.cycles-pp.syscall
     14.12            -0.3       13.79        perf-profile.self.cycles-pp.futex_wait_setup
      4.08            -0.3        3.80        perf-profile.self.cycles-pp.entry_SYSCALL_64
      6.12            -0.2        5.88        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      3.32            -0.2        3.09        perf-profile.self.cycles-pp.__x64_sys_futex
      3.46            -0.2        3.29        perf-profile.self.cycles-pp.futex_wait
      4.49            -0.1        4.38        perf-profile.self.cycles-pp.do_syscall_64
      1.98            -0.1        1.89        perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.42            -0.1        1.33        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.31            -0.1        1.23        perf-profile.self.cycles-pp.testcase
      1.14            -0.1        1.07        perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      3.15            -0.1        3.08        perf-profile.self.cycles-pp.do_futex
      2.07            -0.1        2.02        perf-profile.self.cycles-pp.x64_sys_call
      0.81            -0.0        0.76        perf-profile.self.cycles-pp.futex_hash_put
      0.53            -0.0        0.50        perf-profile.self.cycles-pp.futex_setup_timer
      0.10            +0.0        0.12 ±  4%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.14 ± 10%      +0.1        0.19 ± 16%  perf-profile.self.cycles-pp.ktime_get
      3.69            +0.1        3.74        perf-profile.self.cycles-pp._raw_spin_lock
      2.16            +0.4        2.60        perf-profile.self.cycles-pp.get_futex_key
      2.72            +0.7        3.42        perf-profile.self.cycles-pp.futex_q_lock
      0.00            +4.5        4.52        perf-profile.self.cycles-pp.__futex_hash



***************************************************************************************************
lkp-srf-2sp1: 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp1/futex1/will-it-scale

commit: 
  63e8595c06 ("futex: Allow to make the private hash immutable")
  cec199c5e3 ("futex: Implement FUTEX2_NUMA")

63e8595c060a1fef cec199c5e39bde7191a08087cc3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 1.079e+09 ± 58%     -45.0%  5.934e+08 ±  3%  cpuidle..time
      1.38 ± 53%      -0.7        0.64 ±  6%  mpstat.cpu.all.idle%
   4521392 ± 27%     -41.0%    2666361 ± 55%  numa-meminfo.node0.MemUsed
   3133949 ± 39%     +59.9%    5010749 ± 29%  numa-meminfo.node1.MemUsed
 1.224e+09            -3.2%  1.185e+09        will-it-scale.256.threads
   4780060            -3.2%    4627197        will-it-scale.per_thread_ops
 1.224e+09            -3.2%  1.185e+09        will-it-scale.workload
      0.04 ± 24%     -29.3%       0.03 ± 37%  perf-stat.i.major-faults
      3964 ±  2%      -3.6%       3821        perf-stat.i.minor-faults
      3964 ±  2%      -3.6%       3821        perf-stat.i.page-faults
    322627            +3.7%     334580        perf-stat.overall.path-length
      0.04 ± 24%     -30.0%       0.03 ± 36%  perf-stat.ps.major-faults
      3934 ±  2%      -3.8%       3785        perf-stat.ps.minor-faults
      3934 ±  2%      -3.8%       3785        perf-stat.ps.page-faults
      0.07 ± 26%     -39.6%       0.04 ± 36%  perf-sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      0.11 ± 16%     +38.2%       0.15 ± 25%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     66.89 ±  7%     -30.1%      46.79 ± 27%  perf-sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
     11.66 ± 90%    +124.7%      26.19 ± 27%  perf-sched.wait_and_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.22 ± 16%     +38.5%       0.30 ± 25%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    414.50 ±  7%     +59.5%     661.00 ± 40%  perf-sched.wait_and_delay.count.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
     66.83 ±  7%     -30.0%      46.75 ± 27%  perf-sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
     11.66 ± 90%    +124.7%      26.19 ± 27%  perf-sched.wait_time.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.11 ± 16%     +38.2%       0.15 ± 25%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     28.45            -3.5       24.98        perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wake.do_futex.__x64_sys_futex
     27.02            -3.3       23.74        perf-profile.calltrace.cycles-pp.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wake.do_futex
     24.86            -2.4       22.46        perf-profile.calltrace.cycles-pp.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wake
     22.10            -1.9       20.22        perf-profile.calltrace.cycles-pp.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key
     30.86            -1.0       29.84        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
     15.86            -0.3       15.57        perf-profile.calltrace.cycles-pp.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast
      7.01            -0.2        6.79        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      3.97            -0.1        3.83        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      4.28            -0.1        4.14        perf-profile.calltrace.cycles-pp.try_get_folio.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      0.75            -0.1        0.62        perf-profile.calltrace.cycles-pp.is_valid_gup_args.get_user_pages_fast.get_futex_key.futex_wake.do_futex
      1.18            -0.1        1.08        perf-profile.calltrace.cycles-pp.___pte_offset_map.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      3.04            -0.1        2.95        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      2.53            -0.0        2.50        perf-profile.calltrace.cycles-pp.testcase
      0.60            -0.0        0.58        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
      0.65            -0.0        0.63        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     55.34            +1.5       56.83        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     53.05            +1.6       54.61        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     43.57            +1.9       45.44        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     42.42            +1.9       44.33        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     41.62            +1.9       43.54        perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.05            +2.1        4.17 ±  8%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      0.00            +3.6        3.63 ±  9%  perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wake.do_futex.__x64_sys_futex
     28.77            -3.7       25.06        perf-profile.children.cycles-pp.get_user_pages_fast
     27.11            -3.3       23.82        perf-profile.children.cycles-pp.gup_fast_fallback
     24.95            -2.4       22.56        perf-profile.children.cycles-pp.gup_fast
     22.15            -1.9       20.25        perf-profile.children.cycles-pp.gup_fast_pgd_range
     20.83            -0.7       20.14        perf-profile.children.cycles-pp.entry_SYSCALL_64
     16.00            -0.6       15.44        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     15.99            -0.3       15.68        perf-profile.children.cycles-pp.gup_fast_pte_range
      7.06            -0.2        6.84        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.77            -0.2        0.62        perf-profile.children.cycles-pp.is_valid_gup_args
      4.33            -0.1        4.18        perf-profile.children.cycles-pp.try_get_folio
      1.20            -0.1        1.10        perf-profile.children.cycles-pp.___pte_offset_map
      2.70            -0.1        2.62        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      2.03            -0.0        2.00        perf-profile.children.cycles-pp.testcase
      0.65            -0.0        0.63        perf-profile.children.cycles-pp.x64_sys_call
      0.65            -0.0        0.63        perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.10            +0.0        0.11        perf-profile.children.cycles-pp.sysvec_thermal
      0.09            +0.0        0.10        perf-profile.children.cycles-pp.intel_thermal_interrupt
      0.10 ±  4%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_thermal
     98.25            +0.0       98.30        perf-profile.children.cycles-pp.syscall
     55.48            +1.5       56.95        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     53.22            +1.5       54.76        perf-profile.children.cycles-pp.do_syscall_64
     43.57            +1.9       45.44        perf-profile.children.cycles-pp.__x64_sys_futex
     42.46            +1.9       44.36        perf-profile.children.cycles-pp.do_futex
     41.72            +1.9       43.64        perf-profile.children.cycles-pp.futex_wake
      2.09            +2.1        4.22 ±  8%  perf-profile.children.cycles-pp.futex_hash
      0.00            +3.7        3.67 ±  9%  perf-profile.children.cycles-pp.__futex_hash
      6.11            -1.6        4.52        perf-profile.self.cycles-pp.gup_fast_pgd_range
      2.04            -1.5        0.50 ±  5%  perf-profile.self.cycles-pp.futex_hash
      1.96            -0.7        1.29        perf-profile.self.cycles-pp.gup_fast_fallback
     15.96            -0.6       15.40        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.10            -0.5        0.62        perf-profile.self.cycles-pp.get_user_pages_fast
      2.71            -0.5        2.24        perf-profile.self.cycles-pp.gup_fast
     14.25            -0.5       13.78        perf-profile.self.cycles-pp.syscall
     10.14            -0.3        9.81        perf-profile.self.cycles-pp.entry_SYSCALL_64
      6.69            -0.2        6.48        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      4.85            -0.2        4.68        perf-profile.self.cycles-pp.futex_wake
      0.72            -0.1        0.58        perf-profile.self.cycles-pp.is_valid_gup_args
      4.31            -0.1        4.17        perf-profile.self.cycles-pp.try_get_folio
      1.98            -0.1        1.91        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.12            -0.1        1.05        perf-profile.self.cycles-pp.___pte_offset_map
      2.06            -0.1        1.99        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.75            -0.1        1.70        perf-profile.self.cycles-pp.do_syscall_64
      1.10            -0.0        1.07        perf-profile.self.cycles-pp.__x64_sys_futex
      0.78            -0.0        0.75        perf-profile.self.cycles-pp.do_futex
      0.65            -0.0        0.63        perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.60            -0.0        0.58        perf-profile.self.cycles-pp.x64_sys_call
      0.19            -0.0        0.18        perf-profile.self.cycles-pp.futex_hash_put
      0.05            +0.0        0.06        perf-profile.self.cycles-pp.intel_thermal_interrupt
      0.00            +3.7        3.65 ±  9%  perf-profile.self.cycles-pp.__futex_hash
      5.84            +3.7        9.49        perf-profile.self.cycles-pp.get_futex_key
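
As a reading aid, the change column in the table above can be reproduced from
the two value columns. The sketch below assumes that throughput and perf-stat
rows report a relative change in percent, and that perf-profile rows report an
absolute difference in percentage points. This is inferred from the printed
numbers, not taken from the lkp-tests sources. The helper names pct_change and
point_delta are hypothetical. The printed values are already rounded, so a
recomputed delta can differ from the table in the last digit.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change in percent, as assumed for the '%change' column."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference in percentage points, as assumed for perf-profile rows."""
    return new - base


# Values copied from the table above (parent commit 63e8595c06 vs. cec199c5e3).
per_thread_ops = (4780060, 4627197)    # will-it-scale.per_thread_ops
path_length = (322627, 334580)         # perf-stat.overall.path-length
futex_hash_children = (2.09, 4.22)     # perf-profile.children.cycles-pp.futex_hash

print(f"per_thread_ops: {pct_change(*per_thread_ops):+.1f}%")            # -3.2%
print(f"path-length:    {pct_change(*path_length):+.1f}%")               # +3.7%
print(f"futex_hash:     {point_delta(*futex_hash_children):+.1f} points") # +2.1 points
```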





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

