From: kernel test robot <oliver.sang@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>, <x86@kernel.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	<oliver.sang@intel.com>
Subject: [tip:locking/futex] [futex]  cec199c5e3: will-it-scale.per_thread_ops 3.2% regression
Date: Fri, 16 May 2025 10:13:11 +0800
Message-ID: <202505160923.2556b729-lkp@intel.com>



Hello,

kernel test robot noticed a 3.2% regression of will-it-scale.per_thread_ops on:


commit: cec199c5e39bde7191a08087cc3d002ccfab31ff ("futex: Implement FUTEX2_NUMA")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git locking/futex

[test failed on linux-next/master bdd609656ff5573db9ba1d26496a528bdd297cf2]

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory
parameters:

	nr_task: 100%
	mode: thread
	test: futex1
	cpufreq_governor: performance


In addition, the commit has a significant impact on the following tests:

+------------------+---------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops  4.6% regression                   |
| test machine     | 384 threads 2 sockets Intel(R) Xeon(R) 6972P (Granite Rapids) with 128G memory  |
| test parameters  | cpufreq_governor=performance                                                    |
|                  | mode=process                                                                    |
|                  | nr_task=100%                                                                    |
|                  | test=futex4                                                                     |
+------------------+---------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops  3.4% regression                    |
| test machine     | 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory |
| test parameters  | cpufreq_governor=performance                                                    |
|                  | mode=thread                                                                     |
|                  | nr_task=100%                                                                    |
|                  | test=futex2                                                                     |
+------------------+---------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202505160923.2556b729-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250516/202505160923.2556b729-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp1/futex1/will-it-scale

commit: 
  63e8595c06 ("futex: Allow to make the private hash immutable")
  cec199c5e3 ("futex: Implement FUTEX2_NUMA")

63e8595c060a1fef cec199c5e39bde7191a08087cc3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 1.079e+09 ± 58%     -45.0%  5.934e+08 ±  3%  cpuidle..time
      1.38 ± 53%      -0.7        0.64 ±  6%  mpstat.cpu.all.idle%
   4521392 ± 27%     -41.0%    2666361 ± 55%  numa-meminfo.node0.MemUsed
   3133949 ± 39%     +59.9%    5010749 ± 29%  numa-meminfo.node1.MemUsed
 1.224e+09            -3.2%  1.185e+09        will-it-scale.256.threads
   4780060            -3.2%    4627197        will-it-scale.per_thread_ops
 1.224e+09            -3.2%  1.185e+09        will-it-scale.workload
      0.04 ± 24%     -29.3%       0.03 ± 37%  perf-stat.i.major-faults
      3964 ±  2%      -3.6%       3821        perf-stat.i.minor-faults
      3964 ±  2%      -3.6%       3821        perf-stat.i.page-faults
    322627            +3.7%     334580        perf-stat.overall.path-length
      0.04 ± 24%     -30.0%       0.03 ± 36%  perf-stat.ps.major-faults
      3934 ±  2%      -3.8%       3785        perf-stat.ps.minor-faults
      3934 ±  2%      -3.8%       3785        perf-stat.ps.page-faults
      0.07 ± 26%     -39.6%       0.04 ± 36%  perf-sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      0.11 ± 16%     +38.2%       0.15 ± 25%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     66.89 ±  7%     -30.1%      46.79 ± 27%  perf-sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
     11.66 ± 90%    +124.7%      26.19 ± 27%  perf-sched.wait_and_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.22 ± 16%     +38.5%       0.30 ± 25%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    414.50 ±  7%     +59.5%     661.00 ± 40%  perf-sched.wait_and_delay.count.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
     66.83 ±  7%     -30.0%      46.75 ± 27%  perf-sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
     11.66 ± 90%    +124.7%      26.19 ± 27%  perf-sched.wait_time.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.11 ± 16%     +38.2%       0.15 ± 25%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     28.45            -3.5       24.98        perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wake.do_futex.__x64_sys_futex
     27.02            -3.3       23.74        perf-profile.calltrace.cycles-pp.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wake.do_futex
     24.86            -2.4       22.46        perf-profile.calltrace.cycles-pp.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wake
     22.10            -1.9       20.22        perf-profile.calltrace.cycles-pp.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key
     30.86            -1.0       29.84        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
     15.86            -0.3       15.57        perf-profile.calltrace.cycles-pp.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast
      7.01            -0.2        6.79        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      3.97            -0.1        3.83        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      4.28            -0.1        4.14        perf-profile.calltrace.cycles-pp.try_get_folio.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      0.75            -0.1        0.62        perf-profile.calltrace.cycles-pp.is_valid_gup_args.get_user_pages_fast.get_futex_key.futex_wake.do_futex
      1.18            -0.1        1.08        perf-profile.calltrace.cycles-pp.___pte_offset_map.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      3.04            -0.1        2.95        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      2.53            -0.0        2.50        perf-profile.calltrace.cycles-pp.testcase
      0.60            -0.0        0.58        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
      0.65            -0.0        0.63        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     55.34            +1.5       56.83        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     53.05            +1.6       54.61        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     43.57            +1.9       45.44        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     42.42            +1.9       44.33        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     41.62            +1.9       43.54        perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.05            +2.1        4.17 ±  8%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      0.00            +3.6        3.63 ±  9%  perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wake.do_futex.__x64_sys_futex
     28.77            -3.7       25.06        perf-profile.children.cycles-pp.get_user_pages_fast
     27.11            -3.3       23.82        perf-profile.children.cycles-pp.gup_fast_fallback
     24.95            -2.4       22.56        perf-profile.children.cycles-pp.gup_fast
     22.15            -1.9       20.25        perf-profile.children.cycles-pp.gup_fast_pgd_range
     20.83            -0.7       20.14        perf-profile.children.cycles-pp.entry_SYSCALL_64
     16.00            -0.6       15.44        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     15.99            -0.3       15.68        perf-profile.children.cycles-pp.gup_fast_pte_range
      7.06            -0.2        6.84        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.77            -0.2        0.62        perf-profile.children.cycles-pp.is_valid_gup_args
      4.33            -0.1        4.18        perf-profile.children.cycles-pp.try_get_folio
      1.20            -0.1        1.10        perf-profile.children.cycles-pp.___pte_offset_map
      2.70            -0.1        2.62        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      2.03            -0.0        2.00        perf-profile.children.cycles-pp.testcase
      0.65            -0.0        0.63        perf-profile.children.cycles-pp.x64_sys_call
      0.65            -0.0        0.63        perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.10            +0.0        0.11        perf-profile.children.cycles-pp.sysvec_thermal
      0.09            +0.0        0.10        perf-profile.children.cycles-pp.intel_thermal_interrupt
      0.10 ±  4%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_thermal
     98.25            +0.0       98.30        perf-profile.children.cycles-pp.syscall
     55.48            +1.5       56.95        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     53.22            +1.5       54.76        perf-profile.children.cycles-pp.do_syscall_64
     43.57            +1.9       45.44        perf-profile.children.cycles-pp.__x64_sys_futex
     42.46            +1.9       44.36        perf-profile.children.cycles-pp.do_futex
     41.72            +1.9       43.64        perf-profile.children.cycles-pp.futex_wake
      2.09            +2.1        4.22 ±  8%  perf-profile.children.cycles-pp.futex_hash
      0.00            +3.7        3.67 ±  9%  perf-profile.children.cycles-pp.__futex_hash
      6.11            -1.6        4.52        perf-profile.self.cycles-pp.gup_fast_pgd_range
      2.04            -1.5        0.50 ±  5%  perf-profile.self.cycles-pp.futex_hash
      1.96            -0.7        1.29        perf-profile.self.cycles-pp.gup_fast_fallback
     15.96            -0.6       15.40        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.10            -0.5        0.62        perf-profile.self.cycles-pp.get_user_pages_fast
      2.71            -0.5        2.24        perf-profile.self.cycles-pp.gup_fast
     14.25            -0.5       13.78        perf-profile.self.cycles-pp.syscall
     10.14            -0.3        9.81        perf-profile.self.cycles-pp.entry_SYSCALL_64
      6.69            -0.2        6.48        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      4.85            -0.2        4.68        perf-profile.self.cycles-pp.futex_wake
      0.72            -0.1        0.58        perf-profile.self.cycles-pp.is_valid_gup_args
      4.31            -0.1        4.17        perf-profile.self.cycles-pp.try_get_folio
      1.98            -0.1        1.91        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.12            -0.1        1.05        perf-profile.self.cycles-pp.___pte_offset_map
      2.06            -0.1        1.99        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.75            -0.1        1.70        perf-profile.self.cycles-pp.do_syscall_64
      1.10            -0.0        1.07        perf-profile.self.cycles-pp.__x64_sys_futex
      0.78            -0.0        0.75        perf-profile.self.cycles-pp.do_futex
      0.65            -0.0        0.63        perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.60            -0.0        0.58        perf-profile.self.cycles-pp.x64_sys_call
      0.19            -0.0        0.18        perf-profile.self.cycles-pp.futex_hash_put
      0.05            +0.0        0.06        perf-profile.self.cycles-pp.intel_thermal_interrupt
      0.00            +3.7        3.65 ±  9%  perf-profile.self.cycles-pp.__futex_hash
      5.84            +3.7        9.49        perf-profile.self.cycles-pp.get_futex_key
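
For readers checking the table above by hand, here is a minimal sketch of how the %change column appears to be derived. It assumes, based only on the numbers shown, that counter-style rows (e.g. will-it-scale.per_thread_ops) report a relative change in percent, while perf-profile rows report an absolute difference in percentage points. The function names are hypothetical and are not part of the lkp tooling.

```python
def relative_change_pct(base: float, new: float) -> float:
    """Relative change in percent (assumed formula for counter-style rows)."""
    return (new - base) / base * 100.0


def point_delta(base_pct: float, new_pct: float) -> float:
    """Absolute difference in percentage points (assumed for perf-profile rows)."""
    return new_pct - base_pct


# Values copied from the futex1 table above: parent commit vs. cec199c5e3.
per_thread_ops = relative_change_pct(4780060, 4627197)
gup_fast_share = point_delta(28.45, 24.98)    # get_user_pages_fast calltrace row
futex_hash_share = point_delta(2.05, 4.17)    # futex_hash calltrace row

print(f"will-it-scale.per_thread_ops: {per_thread_ops:+.1f}%")
print(f"get_user_pages_fast:          {gup_fast_share:+.1f}")
print(f"futex_hash:                   {futex_hash_share:+.1f}")
```

Rounded to one decimal, these reproduce the -3.2%, -3.5 and +2.1 entries reported for those rows.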


***************************************************************************************************
lkp-gnr-2ap2: 384 threads 2 sockets Intel(R) Xeon(R) 6972P (Granite Rapids) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-gnr-2ap2/futex4/will-it-scale

commit: 
  63e8595c06 ("futex: Allow to make the private hash immutable")
  cec199c5e3 ("futex: Implement FUTEX2_NUMA")

63e8595c060a1fef cec199c5e39bde7191a08087cc3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.72 ±  4%      +0.1        0.82 ±  6%  mpstat.cpu.all.irq%
   2699383           -13.4%    2337183 ±  6%  numa-meminfo.node1.Shmem
    123578 ±102%    +116.6%     267679 ± 25%  numa-numastat.node1.other_node
    123578 ±102%    +116.6%     267679 ± 25%  numa-vmstat.node1.numa_other
 2.323e+09            -4.6%  2.216e+09        will-it-scale.384.processes
   6049881            -4.6%    5771879        will-it-scale.per_process_ops
 2.323e+09            -4.6%  2.216e+09        will-it-scale.workload
      2.14 ± 53%     -69.0%       0.66 ±149%  perf-sched.sch_delay.avg.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
      2.14 ± 53%     -69.0%       0.66 ±149%  perf-sched.sch_delay.max.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
    351.13 ±130%    +836.1%       3286 ± 49%  perf-sched.wait_and_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      2.14 ± 53%     -68.9%       0.67 ±148%  perf-sched.wait_time.avg.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
      2.14 ± 53%     -68.9%       0.67 ±148%  perf-sched.wait_time.max.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
    342.14 ±135%    +860.6%       3286 ± 49%  perf-sched.wait_time.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
 2.319e+11            +1.8%  2.361e+11        perf-stat.i.branch-instructions
      0.92 ±  4%      -4.3%       0.88        perf-stat.i.cpi
 1.473e+12            +2.2%  1.506e+12        perf-stat.i.instructions
      1.11            +2.2%       1.14        perf-stat.i.ipc
      0.89            -1.6%       0.88        perf-stat.overall.cpi
      1.12            +1.6%       1.14        perf-stat.overall.ipc
    193483            +6.3%     205704        perf-stat.overall.path-length
  2.31e+11            +1.8%  2.353e+11        perf-stat.ps.branch-instructions
 1.468e+12            +2.3%  1.501e+12        perf-stat.ps.instructions
 4.495e+14            +1.4%  4.559e+14        perf-stat.total.instructions
      3.43 ±  2%      -0.6        2.82 ±  2%  perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
     28.24            -0.6       27.62        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
      9.46            -0.5        8.95        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      3.36            -0.3        3.10        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      1.76            -0.1        1.68 ±  2%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
      1.31            -0.1        1.23        perf-profile.calltrace.cycles-pp.testcase
      1.13            -0.1        1.06        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      2.09            -0.0        2.04        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.64            -0.0        1.61        perf-profile.calltrace.cycles-pp.futex_hash_put.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.72            -0.0        0.70        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      2.16            +0.5        2.63        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
      2.92            +0.8        3.74        perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
     68.21            +1.1       69.32        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     65.61            +1.1       66.75        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     49.34            +1.8       51.14        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     45.75            +2.0       47.78        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     43.36            +2.1       45.44        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.80 ±  2%      +2.1        5.94 ±  2%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wait_setup.__futex_wait.futex_wait.do_futex
     38.92            +2.3       41.20        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     29.49            +2.6       32.08        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      0.00            +4.5        4.50        perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wait_setup.__futex_wait.futex_wait
      3.59 ±  2%      -0.6        2.99        perf-profile.children.cycles-pp.futex_q_unlock
     10.00            -0.6        9.43        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
     16.65            -0.5       16.17        perf-profile.children.cycles-pp.entry_SYSCALL_64
      6.34            -0.3        6.08        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      2.41            -0.2        2.23        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      1.44            -0.1        1.33        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      1.98            -0.1        1.89        perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.51            -0.1        1.43        perf-profile.children.cycles-pp.testcase
      1.41            -0.1        1.36        perf-profile.children.cycles-pp.futex_hash_put
      2.34            -0.1        2.28        perf-profile.children.cycles-pp.x64_sys_call
      0.79            -0.0        0.74        perf-profile.children.cycles-pp.futex_setup_timer
      0.10 ±  3%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
      0.16 ±  3%      +0.0        0.17 ±  2%  perf-profile.children.cycles-pp.sched_tick
      0.14 ±  9%      +0.1        0.20 ± 16%  perf-profile.children.cycles-pp.ktime_get
      0.14 ± 11%      +0.1        0.20 ± 14%  perf-profile.children.cycles-pp.clockevents_program_event
      0.34 ±  9%      +0.1        0.42 ± 12%  perf-profile.children.cycles-pp.update_process_times
      0.42 ±  8%      +0.1        0.52 ± 12%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.42 ±  9%      +0.1        0.51 ± 13%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.70 ±  7%      +0.2        0.86 ± 10%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.74 ±  6%      +0.2        0.90 ± 10%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.68 ±  7%      +0.2        0.84 ± 10%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.68 ±  7%      +0.2        0.84 ± 10%  perf-profile.children.cycles-pp.hrtimer_interrupt
      2.34            +0.4        2.78        perf-profile.children.cycles-pp.get_futex_key
      3.07            +0.8        3.91        perf-profile.children.cycles-pp.futex_q_lock
     66.52            +1.1       67.64        perf-profile.children.cycles-pp.do_syscall_64
     68.64            +1.1       69.76        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     49.87            +1.8       51.64        perf-profile.children.cycles-pp.__x64_sys_futex
     46.54            +2.0       48.55        perf-profile.children.cycles-pp.do_futex
     43.64            +2.1       45.70        perf-profile.children.cycles-pp.futex_wait
     39.38            +2.3       41.65        perf-profile.children.cycles-pp.__futex_wait
      3.95 ±  2%      +2.4        6.32 ±  2%  perf-profile.children.cycles-pp.futex_hash
     31.16            +2.6       33.80        perf-profile.children.cycles-pp.futex_wait_setup
      0.00            +4.7        4.74        perf-profile.children.cycles-pp.__futex_hash
      3.76 ±  2%      -2.2        1.56 ±  9%  perf-profile.self.cycles-pp.futex_hash
      3.36 ±  2%      -0.6        2.80 ±  2%  perf-profile.self.cycles-pp.futex_q_unlock
      8.55            -0.5        8.08        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      8.11            -0.4        7.75        perf-profile.self.cycles-pp.__futex_wait
     15.92            -0.4       15.56        perf-profile.self.cycles-pp.syscall
     14.12            -0.3       13.79        perf-profile.self.cycles-pp.futex_wait_setup
      4.08            -0.3        3.80        perf-profile.self.cycles-pp.entry_SYSCALL_64
      6.12            -0.2        5.88        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      3.32            -0.2        3.09        perf-profile.self.cycles-pp.__x64_sys_futex
      3.46            -0.2        3.29        perf-profile.self.cycles-pp.futex_wait
      4.49            -0.1        4.38        perf-profile.self.cycles-pp.do_syscall_64
      1.98            -0.1        1.89        perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.42            -0.1        1.33        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.31            -0.1        1.23        perf-profile.self.cycles-pp.testcase
      1.14            -0.1        1.07        perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      3.15            -0.1        3.08        perf-profile.self.cycles-pp.do_futex
      2.07            -0.1        2.02        perf-profile.self.cycles-pp.x64_sys_call
      0.81            -0.0        0.76        perf-profile.self.cycles-pp.futex_hash_put
      0.53            -0.0        0.50        perf-profile.self.cycles-pp.futex_setup_timer
      0.10            +0.0        0.12 ±  4%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.14 ± 10%      +0.1        0.19 ± 16%  perf-profile.self.cycles-pp.ktime_get
      3.69            +0.1        3.74        perf-profile.self.cycles-pp._raw_spin_lock
      2.16            +0.4        2.60        perf-profile.self.cycles-pp.get_futex_key
      2.72            +0.7        3.42        perf-profile.self.cycles-pp.futex_q_lock
      0.00            +4.5        4.52        perf-profile.self.cycles-pp.__futex_hash



***************************************************************************************************
lkp-srf-2sp1: 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp1/futex2/will-it-scale

commit: 
  63e8595c06 ("futex: Allow to make the private hash immutable")
  cec199c5e3 ("futex: Implement FUTEX2_NUMA")

63e8595c060a1fef cec199c5e39bde7191a08087cc3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     41.17 ±  8%     -21.1%      32.50 ± 15%  perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     22531 ±  3%     +13.6%      25603 ± 18%  proc-vmstat.numa_pages_migrated
     95467 ± 22%     +48.6%     141903 ± 19%  proc-vmstat.numa_pte_updates
     22531 ±  3%     +13.6%      25603 ± 18%  proc-vmstat.pgmigrate_success
 9.212e+08            -3.4%  8.901e+08        will-it-scale.256.threads
   3598280            -3.4%    3477060        will-it-scale.per_thread_ops
 9.212e+08            -3.4%  8.901e+08        will-it-scale.workload
   7870684 ± 34%    +203.6%   23897639 ± 30%  perf-stat.i.branch-misses
     41.81 ± 55%     -20.4       21.40 ± 90%  perf-stat.i.cache-miss-rate%
      0.00 ± 34%      +0.0        0.01 ± 30%  perf-stat.overall.branch-miss-rate%
     33.72 ± 44%     -15.4       18.36 ± 69%  perf-stat.overall.cache-miss-rate%
    364549            +3.4%     376857        perf-stat.overall.path-length
   7813564 ± 34%    +204.1%   23764749 ± 30%  perf-stat.ps.branch-misses
     21.56            -2.4       19.17        perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wait_setup.__futex_wait.futex_wait
     20.55            -2.3       18.23        perf-profile.calltrace.cycles-pp.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wait_setup.__futex_wait
     18.89            -1.6       17.27        perf-profile.calltrace.cycles-pp.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wait_setup
     16.79            -1.2       15.55        perf-profile.calltrace.cycles-pp.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key
     23.20            -0.8       22.41        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
     12.14            -0.6       11.58        perf-profile.calltrace.cycles-pp.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast
      2.72 ±  2%      -0.2        2.56        perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      4.52            -0.1        4.37        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      2.91 ±  2%      -0.1        2.76 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      3.20            -0.1        3.10        perf-profile.calltrace.cycles-pp.try_get_folio.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      0.88            -0.1        0.78        perf-profile.calltrace.cycles-pp.___pte_offset_map.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      2.98            -0.1        2.88        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      2.58            -0.1        2.50        perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      1.80            -0.1        1.74        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
     26.00            +0.4       26.36        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
     65.92            +1.1       67.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     64.16            +1.1       65.30        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     57.75            +1.4       59.10        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     56.92            +1.4       58.29        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     56.20            +1.4       57.60        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     54.70            +1.5       56.16        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     46.65            +1.6       48.28        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      1.72 ±  7%      +1.8        3.55 ±  3%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00            +3.0        3.01        perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wait_setup.__futex_wait.futex_wait
     21.79            -2.6       19.24        perf-profile.children.cycles-pp.get_user_pages_fast
     20.62            -2.3       18.30        perf-profile.children.cycles-pp.gup_fast_fallback
     18.96            -1.6       17.34        perf-profile.children.cycles-pp.gup_fast
     16.82            -1.2       15.58        perf-profile.children.cycles-pp.gup_fast_pgd_range
     12.21            -0.5       11.66        perf-profile.children.cycles-pp.gup_fast_pte_range
     15.66            -0.5       15.13        perf-profile.children.cycles-pp.entry_SYSCALL_64
     12.01            -0.4       11.62        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      2.76 ±  2%      -0.2        2.60        perf-profile.children.cycles-pp.futex_q_unlock
      4.59            -0.2        4.43        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      2.95 ±  2%      -0.1        2.80 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock
      0.59            -0.1        0.47        perf-profile.children.cycles-pp.is_valid_gup_args
      0.91            -0.1        0.80        perf-profile.children.cycles-pp.___pte_offset_map
      3.24            -0.1        3.13        perf-profile.children.cycles-pp.try_get_folio
      2.62            -0.1        2.53        perf-profile.children.cycles-pp.futex_q_lock
      2.03            -0.1        1.96        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.41            -0.0        0.39        perf-profile.children.cycles-pp.try_grab_folio_fast
      0.50            -0.0        0.49        perf-profile.children.cycles-pp.testcase
      0.49            -0.0        0.47        perf-profile.children.cycles-pp.x64_sys_call
      0.49            -0.0        0.47        perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.15            -0.0        0.14        perf-profile.children.cycles-pp.futex_setup_timer
      0.09            +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.__sysvec_thermal
      0.09            +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.intel_thermal_interrupt
     26.06            +0.4       26.43        perf-profile.children.cycles-pp.get_futex_key
     66.09            +1.1       67.16        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     64.36            +1.1       65.48        perf-profile.children.cycles-pp.do_syscall_64
     57.75            +1.4       59.10        perf-profile.children.cycles-pp.__x64_sys_futex
     56.92            +1.4       58.29        perf-profile.children.cycles-pp.do_futex
     56.22            +1.4       57.62        perf-profile.children.cycles-pp.futex_wait
     55.22            +1.4       56.67        perf-profile.children.cycles-pp.__futex_wait
     46.26            +1.7       48.00        perf-profile.children.cycles-pp.futex_wait_setup
      1.75 ±  7%      +1.8        3.57 ±  3%  perf-profile.children.cycles-pp.futex_hash
      0.00            +3.0        3.04        perf-profile.children.cycles-pp.__futex_hash
      1.71 ±  7%      -1.2        0.50 ± 23%  perf-profile.self.cycles-pp.futex_hash
      4.58            -0.7        3.88 ±  2%  perf-profile.self.cycles-pp.gup_fast_pgd_range
      1.49            -0.5        0.98 ±  2%  perf-profile.self.cycles-pp.gup_fast_fallback
     11.97            -0.4       11.59        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      2.07            -0.4        1.69        perf-profile.self.cycles-pp.gup_fast
     11.78            -0.3       11.46        perf-profile.self.cycles-pp.syscall
      9.43            -0.3        9.12        perf-profile.self.cycles-pp.__futex_wait
      0.76            -0.3        0.47        perf-profile.self.cycles-pp.get_user_pages_fast
      7.34            -0.3        7.08        perf-profile.self.cycles-pp.gup_fast_pte_range
      7.64            -0.3        7.38        perf-profile.self.cycles-pp.entry_SYSCALL_64
      2.72 ±  2%      -0.2        2.56        perf-profile.self.cycles-pp.futex_q_unlock
      0.58            -0.1        0.44        perf-profile.self.cycles-pp.is_valid_gup_args
      4.20            -0.1        4.06        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.88            -0.1        0.76        perf-profile.self.cycles-pp.___pte_offset_map
      3.23            -0.1        3.11        perf-profile.self.cycles-pp.try_get_folio
      2.61            -0.1        2.52        perf-profile.self.cycles-pp.futex_q_lock
      1.53            -0.1        1.48        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.54            -0.1        1.49        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.39            -0.0        1.35        perf-profile.self.cycles-pp.do_syscall_64
      0.41            -0.0        0.37        perf-profile.self.cycles-pp.try_grab_folio_fast
      0.86            -0.0        0.83        perf-profile.self.cycles-pp.futex_wait
      0.83            -0.0        0.80        perf-profile.self.cycles-pp.__x64_sys_futex
      0.70            -0.0        0.67        perf-profile.self.cycles-pp.do_futex
      0.45            -0.0        0.44        perf-profile.self.cycles-pp.x64_sys_call
      0.47            -0.0        0.45        perf-profile.self.cycles-pp.testcase
      0.45            -0.0        0.44        perf-profile.self.cycles-pp.syscall_return_via_sysret
      4.24            +2.9        7.17        perf-profile.self.cycles-pp.get_futex_key
      0.00            +3.0        3.03        perf-profile.self.cycles-pp.__futex_hash
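
The perf-profile rows above share one layout: base value, absolute delta,
new value, then the metric name, with an optional "± N%" deviation after
either value. As a rough sketch for readers who want to rank these rows by
shift, the snippet below parses that layout. It is not part of lkp-tests;
the names ROW and parse_row are hypothetical, and the regular expression
assumes only the column layout visible in this report.

```python
import re

# Assumed layout (taken from the rows in this report, not from lkp-tests):
#   <base> [± N%]  <+/-delta>  <new> [± N%]  perf-profile.<kind>.cycles-pp.<path>
ROW = re.compile(
    r"^\s*(\d+\.\d+)(?:\s*±\s*(\d+)%)?"      # base value, optional stddev %
    r"\s+([+-]\d+\.\d+)"                     # absolute delta in percentage points
    r"\s+(\d+\.\d+)(?:\s*±\s*(\d+)%)?"       # new value, optional stddev %
    r"\s+perf-profile\.(\w+)\.cycles-pp\.(\S+)\s*$"
)


def parse_row(line):
    """Return a dict for one perf-profile row, or None if it does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    base, base_sd, delta, new, new_sd, kind, path = m.groups()
    return {
        "base": float(base),
        "base_stddev_pct": int(base_sd) if base_sd else None,
        "delta": float(delta),
        "new": float(new),
        "new_stddev_pct": int(new_sd) if new_sd else None,
        "kind": kind,   # "calltrace", "children" or "self"
        "path": path,   # symbol, or dotted call chain for calltrace rows
    }


# Three rows copied from the table above.
sample = """\
      1.71 ±  7%      -1.2        0.50 ± 23%  perf-profile.self.cycles-pp.futex_hash
      4.24            +2.9        7.17        perf-profile.self.cycles-pp.get_futex_key
      0.00            +3.0        3.03        perf-profile.self.cycles-pp.__futex_hash
"""

rows = [r for r in map(parse_row, sample.splitlines()) if r is not None]

# Largest absolute shift first.
for r in sorted(rows, key=lambda r: abs(r["delta"]), reverse=True):
    print(f'{r["delta"]:+5.1f}  {r["kind"]:<9}  {r["path"]}')
```

Sorting by absolute delta puts the rows with the largest change in cycle
share at the top, whichever direction they moved.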





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

