* [tip:locking/futex] [futex] cec199c5e3: will-it-scale.per_thread_ops 3.2% regression
From: kernel test robot @ 2025-05-16 2:13 UTC (permalink / raw)
To: Peter Zijlstra
Cc: oe-lkp, lkp, linux-kernel, x86, Sebastian Andrzej Siewior,
oliver.sang
Hello,
kernel test robot noticed a 3.2% regression of will-it-scale.per_thread_ops on:
commit: cec199c5e39bde7191a08087cc3d002ccfab31ff ("futex: Implement FUTEX2_NUMA")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git locking/futex
[test failed on linux-next/master bdd609656ff5573db9ba1d26496a528bdd297cf2]
testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory
parameters:
nr_task: 100%
mode: thread
test: futex1
cpufreq_governor: performance
In addition to that, the commit also has a significant impact on the following tests:
+------------------+---------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 4.6% regression                    |
| test machine     | 384 threads 2 sockets Intel(R) Xeon(R) 6972P (Granite Rapids) with 128G memory  |
| test parameters  | cpufreq_governor=performance                                                    |
|                  | mode=process                                                                    |
|                  | nr_task=100%                                                                    |
|                  | test=futex4                                                                     |
+------------------+---------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 3.4% regression                     |
| test machine     | 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory |
| test parameters  | cpufreq_governor=performance                                                    |
|                  | mode=thread                                                                     |
|                  | nr_task=100%                                                                    |
|                  | test=futex2                                                                     |
+------------------+---------------------------------------------------------------------------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202505160923.2556b729-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250516/202505160923.2556b729-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp1/futex1/will-it-scale
commit:
63e8595c06 ("futex: Allow to make the private hash immutable")
cec199c5e3 ("futex: Implement FUTEX2_NUMA")
63e8595c060a1fef cec199c5e39bde7191a08087cc3
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.079e+09 ± 58% -45.0% 5.934e+08 ± 3% cpuidle..time
1.38 ± 53% -0.7 0.64 ± 6% mpstat.cpu.all.idle%
4521392 ± 27% -41.0% 2666361 ± 55% numa-meminfo.node0.MemUsed
3133949 ± 39% +59.9% 5010749 ± 29% numa-meminfo.node1.MemUsed
1.224e+09 -3.2% 1.185e+09 will-it-scale.256.threads
4780060 -3.2% 4627197 will-it-scale.per_thread_ops
1.224e+09 -3.2% 1.185e+09 will-it-scale.workload
0.04 ± 24% -29.3% 0.03 ± 37% perf-stat.i.major-faults
3964 ± 2% -3.6% 3821 perf-stat.i.minor-faults
3964 ± 2% -3.6% 3821 perf-stat.i.page-faults
322627 +3.7% 334580 perf-stat.overall.path-length
0.04 ± 24% -30.0% 0.03 ± 36% perf-stat.ps.major-faults
3934 ± 2% -3.8% 3785 perf-stat.ps.minor-faults
3934 ± 2% -3.8% 3785 perf-stat.ps.page-faults
0.07 ± 26% -39.6% 0.04 ± 36% perf-sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
0.11 ± 16% +38.2% 0.15 ± 25% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
66.89 ± 7% -30.1% 46.79 ± 27% perf-sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
11.66 ± 90% +124.7% 26.19 ± 27% perf-sched.wait_and_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
0.22 ± 16% +38.5% 0.30 ± 25% perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
414.50 ± 7% +59.5% 661.00 ± 40% perf-sched.wait_and_delay.count.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
66.83 ± 7% -30.0% 46.75 ± 27% perf-sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
11.66 ± 90% +124.7% 26.19 ± 27% perf-sched.wait_time.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
0.11 ± 16% +38.2% 0.15 ± 25% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
28.45 -3.5 24.98 perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wake.do_futex.__x64_sys_futex
27.02 -3.3 23.74 perf-profile.calltrace.cycles-pp.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wake.do_futex
24.86 -2.4 22.46 perf-profile.calltrace.cycles-pp.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wake
22.10 -1.9 20.22 perf-profile.calltrace.cycles-pp.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key
30.86 -1.0 29.84 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
15.86 -0.3 15.57 perf-profile.calltrace.cycles-pp.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast
7.01 -0.2 6.79 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
3.97 -0.1 3.83 perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
4.28 -0.1 4.14 perf-profile.calltrace.cycles-pp.try_get_folio.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
0.75 -0.1 0.62 perf-profile.calltrace.cycles-pp.is_valid_gup_args.get_user_pages_fast.get_futex_key.futex_wake.do_futex
1.18 -0.1 1.08 perf-profile.calltrace.cycles-pp.___pte_offset_map.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
3.04 -0.1 2.95 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
2.53 -0.0 2.50 perf-profile.calltrace.cycles-pp.testcase
0.60 -0.0 0.58 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
0.65 -0.0 0.63 perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
55.34 +1.5 56.83 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
53.05 +1.6 54.61 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
43.57 +1.9 45.44 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
42.42 +1.9 44.33 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
41.62 +1.9 43.54 perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.05 +2.1 4.17 ± 8% perf-profile.calltrace.cycles-pp.futex_hash.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
0.00 +3.6 3.63 ± 9% perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wake.do_futex.__x64_sys_futex
28.77 -3.7 25.06 perf-profile.children.cycles-pp.get_user_pages_fast
27.11 -3.3 23.82 perf-profile.children.cycles-pp.gup_fast_fallback
24.95 -2.4 22.56 perf-profile.children.cycles-pp.gup_fast
22.15 -1.9 20.25 perf-profile.children.cycles-pp.gup_fast_pgd_range
20.83 -0.7 20.14 perf-profile.children.cycles-pp.entry_SYSCALL_64
16.00 -0.6 15.44 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
15.99 -0.3 15.68 perf-profile.children.cycles-pp.gup_fast_pte_range
7.06 -0.2 6.84 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.77 -0.2 0.62 perf-profile.children.cycles-pp.is_valid_gup_args
4.33 -0.1 4.18 perf-profile.children.cycles-pp.try_get_folio
1.20 -0.1 1.10 perf-profile.children.cycles-pp.___pte_offset_map
2.70 -0.1 2.62 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
2.03 -0.0 2.00 perf-profile.children.cycles-pp.testcase
0.65 -0.0 0.63 perf-profile.children.cycles-pp.x64_sys_call
0.65 -0.0 0.63 perf-profile.children.cycles-pp.syscall_return_via_sysret
0.10 +0.0 0.11 perf-profile.children.cycles-pp.sysvec_thermal
0.09 +0.0 0.10 perf-profile.children.cycles-pp.intel_thermal_interrupt
0.10 ± 4% +0.0 0.12 ± 4% perf-profile.children.cycles-pp.asm_sysvec_thermal
98.25 +0.0 98.30 perf-profile.children.cycles-pp.syscall
55.48 +1.5 56.95 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
53.22 +1.5 54.76 perf-profile.children.cycles-pp.do_syscall_64
43.57 +1.9 45.44 perf-profile.children.cycles-pp.__x64_sys_futex
42.46 +1.9 44.36 perf-profile.children.cycles-pp.do_futex
41.72 +1.9 43.64 perf-profile.children.cycles-pp.futex_wake
2.09 +2.1 4.22 ± 8% perf-profile.children.cycles-pp.futex_hash
0.00 +3.7 3.67 ± 9% perf-profile.children.cycles-pp.__futex_hash
6.11 -1.6 4.52 perf-profile.self.cycles-pp.gup_fast_pgd_range
2.04 -1.5 0.50 ± 5% perf-profile.self.cycles-pp.futex_hash
1.96 -0.7 1.29 perf-profile.self.cycles-pp.gup_fast_fallback
15.96 -0.6 15.40 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
1.10 -0.5 0.62 perf-profile.self.cycles-pp.get_user_pages_fast
2.71 -0.5 2.24 perf-profile.self.cycles-pp.gup_fast
14.25 -0.5 13.78 perf-profile.self.cycles-pp.syscall
10.14 -0.3 9.81 perf-profile.self.cycles-pp.entry_SYSCALL_64
6.69 -0.2 6.48 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
4.85 -0.2 4.68 perf-profile.self.cycles-pp.futex_wake
0.72 -0.1 0.58 perf-profile.self.cycles-pp.is_valid_gup_args
4.31 -0.1 4.17 perf-profile.self.cycles-pp.try_get_folio
1.98 -0.1 1.91 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
1.12 -0.1 1.05 perf-profile.self.cycles-pp.___pte_offset_map
2.06 -0.1 1.99 perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
1.75 -0.1 1.70 perf-profile.self.cycles-pp.do_syscall_64
1.10 -0.0 1.07 perf-profile.self.cycles-pp.__x64_sys_futex
0.78 -0.0 0.75 perf-profile.self.cycles-pp.do_futex
0.65 -0.0 0.63 perf-profile.self.cycles-pp.syscall_return_via_sysret
0.60 -0.0 0.58 perf-profile.self.cycles-pp.x64_sys_call
0.19 -0.0 0.18 perf-profile.self.cycles-pp.futex_hash_put
0.05 +0.0 0.06 perf-profile.self.cycles-pp.intel_thermal_interrupt
0.00 +3.7 3.65 ± 9% perf-profile.self.cycles-pp.__futex_hash
5.84 +3.7 9.49 perf-profile.self.cycles-pp.get_futex_key
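The %change column can be sanity-checked from the two value columns. The sketch below is not part of the robot's tooling; it recomputes the relative change for two rows of the futex1 comparison above and the absolute delta for one perf-profile row, whose values are already percentages. The helper names are made up for illustration.

```python
def relative_change_pct(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference, for rows whose values are already percentages."""
    return new - base


# Values copied from the futex1 comparison above (63e8595c06 vs cec199c5e3).
per_thread_ops = relative_change_pct(4780060, 4627197)
path_length = relative_change_pct(322627, 334580)
gup_fast_calltrace = point_delta(28.45, 24.98)

print(f"will-it-scale.per_thread_ops:  {per_thread_ops:+.1f}%")
print(f"perf-stat.overall.path-length: {path_length:+.1f}%")
print(f"get_user_pages_fast calltrace: {gup_fast_calltrace:+.1f} points")
```

Rows with a trailing % in the change column are relative changes; the perf-profile rows report the difference in percentage points.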
***************************************************************************************************
lkp-gnr-2ap2: 384 threads 2 sockets Intel(R) Xeon(R) 6972P (Granite Rapids) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-gnr-2ap2/futex4/will-it-scale
commit:
63e8595c06 ("futex: Allow to make the private hash immutable")
cec199c5e3 ("futex: Implement FUTEX2_NUMA")
63e8595c060a1fef cec199c5e39bde7191a08087cc3
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.72 ± 4% +0.1 0.82 ± 6% mpstat.cpu.all.irq%
2699383 -13.4% 2337183 ± 6% numa-meminfo.node1.Shmem
123578 ±102% +116.6% 267679 ± 25% numa-numastat.node1.other_node
123578 ±102% +116.6% 267679 ± 25% numa-vmstat.node1.numa_other
2.323e+09 -4.6% 2.216e+09 will-it-scale.384.processes
6049881 -4.6% 5771879 will-it-scale.per_process_ops
2.323e+09 -4.6% 2.216e+09 will-it-scale.workload
2.14 ± 53% -69.0% 0.66 ±149% perf-sched.sch_delay.avg.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
2.14 ± 53% -69.0% 0.66 ±149% perf-sched.sch_delay.max.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
351.13 ±130% +836.1% 3286 ± 49% perf-sched.wait_and_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
2.14 ± 53% -68.9% 0.67 ±148% perf-sched.wait_time.avg.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
2.14 ± 53% -68.9% 0.67 ±148% perf-sched.wait_time.max.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
342.14 ±135% +860.6% 3286 ± 49% perf-sched.wait_time.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
2.319e+11 +1.8% 2.361e+11 perf-stat.i.branch-instructions
0.92 ± 4% -4.3% 0.88 perf-stat.i.cpi
1.473e+12 +2.2% 1.506e+12 perf-stat.i.instructions
1.11 +2.2% 1.14 perf-stat.i.ipc
0.89 -1.6% 0.88 perf-stat.overall.cpi
1.12 +1.6% 1.14 perf-stat.overall.ipc
193483 +6.3% 205704 perf-stat.overall.path-length
2.31e+11 +1.8% 2.353e+11 perf-stat.ps.branch-instructions
1.468e+12 +2.3% 1.501e+12 perf-stat.ps.instructions
4.495e+14 +1.4% 4.559e+14 perf-stat.total.instructions
3.43 ± 2% -0.6 2.82 ± 2% perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
28.24 -0.6 27.62 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
9.46 -0.5 8.95 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
3.36 -0.3 3.10 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
1.76 -0.1 1.68 ± 2% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
1.31 -0.1 1.23 perf-profile.calltrace.cycles-pp.testcase
1.13 -0.1 1.06 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
2.09 -0.0 2.04 perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
1.64 -0.0 1.61 perf-profile.calltrace.cycles-pp.futex_hash_put.futex_wait_setup.__futex_wait.futex_wait.do_futex
0.72 -0.0 0.70 perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
2.16 +0.5 2.63 perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
2.92 +0.8 3.74 perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
68.21 +1.1 69.32 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
65.61 +1.1 66.75 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
49.34 +1.8 51.14 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
45.75 +2.0 47.78 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
43.36 +2.1 45.44 perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.80 ± 2% +2.1 5.94 ± 2% perf-profile.calltrace.cycles-pp.futex_hash.futex_wait_setup.__futex_wait.futex_wait.do_futex
38.92 +2.3 41.20 perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
29.49 +2.6 32.08 perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
0.00 +4.5 4.50 perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wait_setup.__futex_wait.futex_wait
3.59 ± 2% -0.6 2.99 perf-profile.children.cycles-pp.futex_q_unlock
10.00 -0.6 9.43 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
16.65 -0.5 16.17 perf-profile.children.cycles-pp.entry_SYSCALL_64
6.34 -0.3 6.08 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
2.41 -0.2 2.23 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
1.44 -0.1 1.33 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
1.98 -0.1 1.89 perf-profile.children.cycles-pp.syscall_return_via_sysret
1.51 -0.1 1.43 perf-profile.children.cycles-pp.testcase
1.41 -0.1 1.36 perf-profile.children.cycles-pp.futex_hash_put
2.34 -0.1 2.28 perf-profile.children.cycles-pp.x64_sys_call
0.79 -0.0 0.74 perf-profile.children.cycles-pp.futex_setup_timer
0.10 ± 3% +0.0 0.12 ± 4% perf-profile.children.cycles-pp.ktime_get_update_offsets_now
0.16 ± 3% +0.0 0.17 ± 2% perf-profile.children.cycles-pp.sched_tick
0.14 ± 9% +0.1 0.20 ± 16% perf-profile.children.cycles-pp.ktime_get
0.14 ± 11% +0.1 0.20 ± 14% perf-profile.children.cycles-pp.clockevents_program_event
0.34 ± 9% +0.1 0.42 ± 12% perf-profile.children.cycles-pp.update_process_times
0.42 ± 8% +0.1 0.52 ± 12% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.42 ± 9% +0.1 0.51 ± 13% perf-profile.children.cycles-pp.tick_nohz_handler
0.70 ± 7% +0.2 0.86 ± 10% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.74 ± 6% +0.2 0.90 ± 10% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.68 ± 7% +0.2 0.84 ± 10% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.68 ± 7% +0.2 0.84 ± 10% perf-profile.children.cycles-pp.hrtimer_interrupt
2.34 +0.4 2.78 perf-profile.children.cycles-pp.get_futex_key
3.07 +0.8 3.91 perf-profile.children.cycles-pp.futex_q_lock
66.52 +1.1 67.64 perf-profile.children.cycles-pp.do_syscall_64
68.64 +1.1 69.76 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
49.87 +1.8 51.64 perf-profile.children.cycles-pp.__x64_sys_futex
46.54 +2.0 48.55 perf-profile.children.cycles-pp.do_futex
43.64 +2.1 45.70 perf-profile.children.cycles-pp.futex_wait
39.38 +2.3 41.65 perf-profile.children.cycles-pp.__futex_wait
3.95 ± 2% +2.4 6.32 ± 2% perf-profile.children.cycles-pp.futex_hash
31.16 +2.6 33.80 perf-profile.children.cycles-pp.futex_wait_setup
0.00 +4.7 4.74 perf-profile.children.cycles-pp.__futex_hash
3.76 ± 2% -2.2 1.56 ± 9% perf-profile.self.cycles-pp.futex_hash
3.36 ± 2% -0.6 2.80 ± 2% perf-profile.self.cycles-pp.futex_q_unlock
8.55 -0.5 8.08 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
8.11 -0.4 7.75 perf-profile.self.cycles-pp.__futex_wait
15.92 -0.4 15.56 perf-profile.self.cycles-pp.syscall
14.12 -0.3 13.79 perf-profile.self.cycles-pp.futex_wait_setup
4.08 -0.3 3.80 perf-profile.self.cycles-pp.entry_SYSCALL_64
6.12 -0.2 5.88 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
3.32 -0.2 3.09 perf-profile.self.cycles-pp.__x64_sys_futex
3.46 -0.2 3.29 perf-profile.self.cycles-pp.futex_wait
4.49 -0.1 4.38 perf-profile.self.cycles-pp.do_syscall_64
1.98 -0.1 1.89 perf-profile.self.cycles-pp.syscall_return_via_sysret
1.42 -0.1 1.33 perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
1.31 -0.1 1.23 perf-profile.self.cycles-pp.testcase
1.14 -0.1 1.07 perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
3.15 -0.1 3.08 perf-profile.self.cycles-pp.do_futex
2.07 -0.1 2.02 perf-profile.self.cycles-pp.x64_sys_call
0.81 -0.0 0.76 perf-profile.self.cycles-pp.futex_hash_put
0.53 -0.0 0.50 perf-profile.self.cycles-pp.futex_setup_timer
0.10 +0.0 0.12 ± 4% perf-profile.self.cycles-pp.ktime_get_update_offsets_now
0.14 ± 10% +0.1 0.19 ± 16% perf-profile.self.cycles-pp.ktime_get
3.69 +0.1 3.74 perf-profile.self.cycles-pp._raw_spin_lock
2.16 +0.4 2.60 perf-profile.self.cycles-pp.get_futex_key
2.72 +0.7 3.42 perf-profile.self.cycles-pp.futex_q_lock
0.00 +4.5 4.52 perf-profile.self.cycles-pp.__futex_hash
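One way to read the futex_hash rows in the futex4 profile above: self time drops in futex_hash while the new __futex_hash symbol appears. The sketch below adds the two self-cycle percentages before and after the commit. Treating the two symbols as a single "hashing" bucket is an assumption made here for illustration, not something the report states.

```python
# Self-cycle percentages copied from the futex4 perf-profile.self rows above.
before = {"futex_hash": 3.76, "__futex_hash": 0.00}
after = {"futex_hash": 1.56, "__futex_hash": 4.52}


def combined_self(profile: dict) -> float:
    """Sum the self-cycle share of every symbol in the bucket."""
    return sum(profile.values())


# Net change of the combined bucket, in percentage points.
net = combined_self(after) - combined_self(before)
print(f"combined hashing self cycles: "
      f"{combined_self(before):.2f} -> {combined_self(after):.2f} ({net:+.2f} points)")
```

The result is close to the change reported for perf-profile.children.cycles-pp.futex_hash in the same table (3.95 to 6.32).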
***************************************************************************************************
lkp-srf-2sp1: 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp1/futex2/will-it-scale
commit:
63e8595c06 ("futex: Allow to make the private hash immutable")
cec199c5e3 ("futex: Implement FUTEX2_NUMA")
63e8595c060a1fef cec199c5e39bde7191a08087cc3
---------------- ---------------------------
%stddev %change %stddev
\ | \
41.17 ± 8% -21.1% 32.50 ± 15% perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
22531 ± 3% +13.6% 25603 ± 18% proc-vmstat.numa_pages_migrated
95467 ± 22% +48.6% 141903 ± 19% proc-vmstat.numa_pte_updates
22531 ± 3% +13.6% 25603 ± 18% proc-vmstat.pgmigrate_success
9.212e+08 -3.4% 8.901e+08 will-it-scale.256.threads
3598280 -3.4% 3477060 will-it-scale.per_thread_ops
9.212e+08 -3.4% 8.901e+08 will-it-scale.workload
7870684 ± 34% +203.6% 23897639 ± 30% perf-stat.i.branch-misses
41.81 ± 55% -20.4 21.40 ± 90% perf-stat.i.cache-miss-rate%
0.00 ± 34% +0.0 0.01 ± 30% perf-stat.overall.branch-miss-rate%
33.72 ± 44% -15.4 18.36 ± 69% perf-stat.overall.cache-miss-rate%
364549 +3.4% 376857 perf-stat.overall.path-length
7813564 ± 34% +204.1% 23764749 ± 30% perf-stat.ps.branch-misses
21.56 -2.4 19.17 perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wait_setup.__futex_wait.futex_wait
20.55 -2.3 18.23 perf-profile.calltrace.cycles-pp.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wait_setup.__futex_wait
18.89 -1.6 17.27 perf-profile.calltrace.cycles-pp.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wait_setup
16.79 -1.2 15.55 perf-profile.calltrace.cycles-pp.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key
23.20 -0.8 22.41 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
12.14 -0.6 11.58 perf-profile.calltrace.cycles-pp.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast
2.72 ± 2% -0.2 2.56 perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
4.52 -0.1 4.37 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
2.91 ± 2% -0.1 2.76 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
3.20 -0.1 3.10 perf-profile.calltrace.cycles-pp.try_get_folio.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
0.88 -0.1 0.78 perf-profile.calltrace.cycles-pp.___pte_offset_map.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
2.98 -0.1 2.88 perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
2.58 -0.1 2.50 perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
1.80 -0.1 1.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
26.00 +0.4 26.36 perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
65.92 +1.1 67.00 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
64.16 +1.1 65.30 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
57.75 +1.4 59.10 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
56.92 +1.4 58.29 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
56.20 +1.4 57.60 perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
54.70 +1.5 56.16 perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
46.65 +1.6 48.28 perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
1.72 ± 7% +1.8 3.55 ± 3% perf-profile.calltrace.cycles-pp.futex_hash.futex_wait_setup.__futex_wait.futex_wait.do_futex
0.00 +3.0 3.01 perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wait_setup.__futex_wait.futex_wait
21.79 -2.6 19.24 perf-profile.children.cycles-pp.get_user_pages_fast
20.62 -2.3 18.30 perf-profile.children.cycles-pp.gup_fast_fallback
18.96 -1.6 17.34 perf-profile.children.cycles-pp.gup_fast
16.82 -1.2 15.58 perf-profile.children.cycles-pp.gup_fast_pgd_range
12.21 -0.5 11.66 perf-profile.children.cycles-pp.gup_fast_pte_range
15.66 -0.5 15.13 perf-profile.children.cycles-pp.entry_SYSCALL_64
12.01 -0.4 11.62 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
2.76 ± 2% -0.2 2.60 perf-profile.children.cycles-pp.futex_q_unlock
4.59 -0.2 4.43 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
2.95 ± 2% -0.1 2.80 ± 2% perf-profile.children.cycles-pp._raw_spin_lock
0.59 -0.1 0.47 perf-profile.children.cycles-pp.is_valid_gup_args
0.91 -0.1 0.80 perf-profile.children.cycles-pp.___pte_offset_map
3.24 -0.1 3.13 perf-profile.children.cycles-pp.try_get_folio
2.62 -0.1 2.53 perf-profile.children.cycles-pp.futex_q_lock
2.03 -0.1 1.96 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.41 -0.0 0.39 perf-profile.children.cycles-pp.try_grab_folio_fast
0.50 -0.0 0.49 perf-profile.children.cycles-pp.testcase
0.49 -0.0 0.47 perf-profile.children.cycles-pp.x64_sys_call
0.49 -0.0 0.47 perf-profile.children.cycles-pp.syscall_return_via_sysret
0.15 -0.0 0.14 perf-profile.children.cycles-pp.futex_setup_timer
0.09 +0.0 0.10 ± 3% perf-profile.children.cycles-pp.__sysvec_thermal
0.09 +0.0 0.10 ± 3% perf-profile.children.cycles-pp.intel_thermal_interrupt
26.06 +0.4 26.43 perf-profile.children.cycles-pp.get_futex_key
66.09 +1.1 67.16 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
64.36 +1.1 65.48 perf-profile.children.cycles-pp.do_syscall_64
57.75 +1.4 59.10 perf-profile.children.cycles-pp.__x64_sys_futex
56.92 +1.4 58.29 perf-profile.children.cycles-pp.do_futex
56.22 +1.4 57.62 perf-profile.children.cycles-pp.futex_wait
55.22 +1.4 56.67 perf-profile.children.cycles-pp.__futex_wait
46.26 +1.7 48.00 perf-profile.children.cycles-pp.futex_wait_setup
1.75 ± 7% +1.8 3.57 ± 3% perf-profile.children.cycles-pp.futex_hash
0.00 +3.0 3.04 perf-profile.children.cycles-pp.__futex_hash
1.71 ± 7% -1.2 0.50 ± 23% perf-profile.self.cycles-pp.futex_hash
4.58 -0.7 3.88 ± 2% perf-profile.self.cycles-pp.gup_fast_pgd_range
1.49 -0.5 0.98 ± 2% perf-profile.self.cycles-pp.gup_fast_fallback
11.97 -0.4 11.59 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
2.07 -0.4 1.69 perf-profile.self.cycles-pp.gup_fast
11.78 -0.3 11.46 perf-profile.self.cycles-pp.syscall
9.43 -0.3 9.12 perf-profile.self.cycles-pp.__futex_wait
0.76 -0.3 0.47 perf-profile.self.cycles-pp.get_user_pages_fast
7.34 -0.3 7.08 perf-profile.self.cycles-pp.gup_fast_pte_range
7.64 -0.3 7.38 perf-profile.self.cycles-pp.entry_SYSCALL_64
2.72 ± 2% -0.2 2.56 perf-profile.self.cycles-pp.futex_q_unlock
0.58 -0.1 0.44 perf-profile.self.cycles-pp.is_valid_gup_args
4.20 -0.1 4.06 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.88 -0.1 0.76 perf-profile.self.cycles-pp.___pte_offset_map
3.23 -0.1 3.11 perf-profile.self.cycles-pp.try_get_folio
2.61 -0.1 2.52 perf-profile.self.cycles-pp.futex_q_lock
1.53 -0.1 1.48 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
1.54 -0.1 1.49 perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
1.39 -0.0 1.35 perf-profile.self.cycles-pp.do_syscall_64
0.41 -0.0 0.37 perf-profile.self.cycles-pp.try_grab_folio_fast
0.86 -0.0 0.83 perf-profile.self.cycles-pp.futex_wait
0.83 -0.0 0.80 perf-profile.self.cycles-pp.__x64_sys_futex
0.70 -0.0 0.67 perf-profile.self.cycles-pp.do_futex
0.45 -0.0 0.44 perf-profile.self.cycles-pp.x64_sys_call
0.47 -0.0 0.45 perf-profile.self.cycles-pp.testcase
0.45 -0.0 0.44 perf-profile.self.cycles-pp.syscall_return_via_sysret
4.24 +2.9 7.17 perf-profile.self.cycles-pp.get_futex_key
0.00 +3.0 3.03 perf-profile.self.cycles-pp.__futex_hash
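Rows in the comparison tables follow a fixed shape: base value, optional ± stddev, change, new value, optional ± stddev, metric name. A minimal parser sketch for that shape follows, assuming single-line rows as they appear in this message. The regular expression and the `Row` and `parse_row` names are illustrative and are not taken from lkp-tests.

```python
import re
from typing import NamedTuple, Optional

NUM = r"\d+(?:\.\d+)?(?:e[+-]?\d+)?"  # 4780060, 0.04, 1.224e+09
ROW = re.compile(
    rf"^\s*(?P<base>{NUM})(?:\s*±\s*(?P<base_sd>\d+)%)?"
    rf"\s+(?P<change>[+-]\d+(?:\.\d+)?)(?P<unit>%?)"
    rf"\s+(?P<new>{NUM})(?:\s*±\s*(?P<new_sd>\d+)%)?"
    rf"\s+(?P<metric>\S+)\s*$"
)


class Row(NamedTuple):
    metric: str
    base: float
    new: float
    change: float
    relative: bool                  # True: relative %, False: percentage points
    base_stddev_pct: Optional[int]
    new_stddev_pct: Optional[int]


def parse_row(line: str) -> Optional[Row]:
    """Parse one comparison row; return None for headers and separators."""
    m = ROW.match(line)
    if m is None:
        return None
    return Row(
        metric=m["metric"],
        base=float(m["base"]),
        new=float(m["new"]),
        change=float(m["change"]),
        relative=m["unit"] == "%",
        base_stddev_pct=int(m["base_sd"]) if m["base_sd"] else None,
        new_stddev_pct=int(m["new_sd"]) if m["new_sd"] else None,
    )


row = parse_row("3598280 -3.4% 3477060 will-it-scale.per_thread_ops")
print(row)
```

Lines that do not match, such as the `%stddev %change %stddev` header or the commit lines, come back as `None`, so a caller can filter a whole message with a list comprehension.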
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki