* [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
@ 2024-08-04 8:59 kernel test robot
2024-08-04 20:32 ` Linus Torvalds
` (2 more replies)
0 siblings, 3 replies; 29+ messages in thread
From: kernel test robot @ 2024-08-04 8:59 UTC (permalink / raw)
To: Jeff Xu
Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook,
Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman,
Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet,
Jorge Lucangeli Obes, Linus Torvalds, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin, oliver.sang
Hello,
kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on:
commit: 8be7258aad44b5e25977a98db136f677fa6f4370 ("mseal: add mseal syscall")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: pagemove
cpufreq_governor: performance
In addition to that, the commit also has a significant impact on the following tests:
+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.pkey.ops_per_sec -3.6% regression |
| test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory |
| test parameters | cpufreq_governor=performance |
| | nr_threads=100% |
| | test=pkey |
| | testtime=60s |
+------------------+---------------------------------------------------------------------------------------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202408041602.caa0372-oliver.sang@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240804/202408041602.caa0372-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
commit:
ff388fe5c4 ("mseal: wire up mseal syscall")
8be7258aad ("mseal: add mseal syscall")
ff388fe5c481d39c 8be7258aad44b5e25977a98db13
---------------- ---------------------------
%stddev %change %stddev
\ | \
41625945 -4.3% 39842322 proc-vmstat.numa_hit
41559175 -4.3% 39774160 proc-vmstat.numa_local
77484314 -4.4% 74105555 proc-vmstat.pgalloc_normal
77205752 -4.4% 73826672 proc-vmstat.pgfree
18361466 -4.2% 17596652 stress-ng.pagemove.ops
306014 -4.2% 293262 stress-ng.pagemove.ops_per_sec
205312 -4.4% 196176 stress-ng.pagemove.page_remaps_per_sec
4961 +1.0% 5013 stress-ng.time.percent_of_cpu_this_job_got
2917 +1.2% 2952 stress-ng.time.system_time
1.07 -6.6% 1.00 perf-stat.i.MPKI
3.354e+10 +3.5% 3.473e+10 perf-stat.i.branch-instructions
1.795e+08 -4.2% 1.719e+08 perf-stat.i.cache-misses
2.376e+08 -4.1% 2.279e+08 perf-stat.i.cache-references
1.13 -3.0% 1.10 perf-stat.i.cpi
1077 +4.3% 1124 perf-stat.i.cycles-between-cache-misses
1.717e+11 +2.7% 1.762e+11 perf-stat.i.instructions
0.88 +3.1% 0.91 perf-stat.i.ipc
1.05 -6.8% 0.97 perf-stat.overall.MPKI
0.25 ± 2% -0.0 0.24 perf-stat.overall.branch-miss-rate%
1.13 -3.0% 1.10 perf-stat.overall.cpi
1084 +4.0% 1127 perf-stat.overall.cycles-between-cache-misses
0.88 +3.1% 0.91 perf-stat.overall.ipc
3.298e+10 +3.5% 3.415e+10 perf-stat.ps.branch-instructions
1.764e+08 -4.3% 1.689e+08 perf-stat.ps.cache-misses
2.336e+08 -4.1% 2.24e+08 perf-stat.ps.cache-references
194.57 -2.4% 189.96 ± 2% perf-stat.ps.cpu-migrations
1.688e+11 +2.7% 1.733e+11 perf-stat.ps.instructions
1.036e+13 +3.0% 1.068e+13 perf-stat.total.instructions
75.12 -1.9 73.22 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
36.84 -1.6 35.29 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
24.90 -1.2 23.72 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
19.89 -0.9 18.98 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
10.56 ± 2% -0.8 9.78 ± 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread
10.56 ± 2% -0.8 9.79 ± 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
10.56 ± 2% -0.8 9.79 ± 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
10.57 ± 2% -0.8 9.80 ± 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
10.52 ± 2% -0.8 9.75 ± 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn
10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
14.75 -0.7 14.07 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
1.50 -0.6 0.94 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
5.88 ± 2% -0.4 5.47 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
7.80 -0.3 7.47 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.55 ± 2% -0.3 4.24 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
6.76 -0.3 6.45 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
6.15 -0.3 5.86 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap
8.22 -0.3 7.93 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
6.12 -0.3 5.87 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
5.74 -0.2 5.50 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
3.16 ± 2% -0.2 2.94 perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
5.50 -0.2 5.28 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap
1.36 -0.2 1.14 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
5.15 -0.2 4.94 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap
5.51 -0.2 5.31 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap
5.16 -0.2 4.97 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma
2.24 -0.2 2.05 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
2.60 ± 2% -0.2 2.42 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs
4.67 -0.2 4.49 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma
3.41 -0.2 3.23 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma
3.00 -0.2 2.83 ± 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
0.96 -0.2 0.80 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to
4.04 -0.2 3.88 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
3.20 ± 2% -0.2 3.04 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
3.53 -0.1 3.38 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap
3.40 -0.1 3.26 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap
2.20 ± 2% -0.1 2.06 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap
1.84 ± 3% -0.1 1.71 ± 3% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap
1.78 ± 2% -0.1 1.65 ± 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap
2.69 -0.1 2.56 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap
1.78 ± 2% -0.1 1.66 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
1.36 ± 2% -0.1 1.23 ± 2% perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
0.95 -0.1 0.83 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap
3.29 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
2.08 -0.1 1.96 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap
1.43 ± 3% -0.1 1.32 ± 3% perf-profile.calltrace.cycles-pp.down_write.vma_prepare.vma_merge.copy_vma.move_vma
2.21 -0.1 2.10 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
2.47 -0.1 2.36 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma
2.21 -0.1 2.12 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables
1.41 -0.1 1.32 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap
1.26 -0.1 1.18 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap
1.82 -0.1 1.75 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
0.71 -0.1 0.63 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap
1.29 -0.1 1.22 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
0.61 -0.1 0.54 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma
1.36 -0.1 1.29 perf-profile.calltrace.cycles-pp.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap
1.40 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma
0.70 -0.1 0.64 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
1.23 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma
1.66 -0.1 1.60 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.16 -0.1 1.10 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
0.96 -0.1 0.90 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region
1.14 -0.1 1.08 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
0.79 -0.1 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma
1.04 -0.1 1.00 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.58 -0.0 0.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap
0.61 -0.0 0.56 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
0.56 -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
0.57 -0.0 0.53 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
0.78 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge
0.88 -0.0 0.84 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64
0.70 -0.0 0.66 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap
0.97 -0.0 0.93 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
1.11 -0.0 1.08 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap
0.75 -0.0 0.72 perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
0.74 -0.0 0.71 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap
0.60 ± 2% -0.0 0.57 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64
0.67 ± 2% -0.0 0.64 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma
0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
0.63 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
0.99 -0.0 0.96 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap
0.62 ± 2% -0.0 0.59 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
0.87 -0.0 0.84 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
0.78 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap
0.64 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap
0.90 -0.0 0.87 perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap
0.54 -0.0 0.52 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.move_ptes.move_page_tables.move_vma.__do_sys_mremap
1.04 +0.0 1.08 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
0.76 +0.1 0.83 perf-profile.calltrace.cycles-pp.__madvise
0.63 +0.1 0.70 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
0.62 +0.1 0.70 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
87.74 +0.7 88.45 perf-profile.calltrace.cycles-pp.mremap
0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap
0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap
84.88 +0.9 85.77 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap
84.73 +0.9 85.62 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
0.00 +0.9 0.92 ± 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma
83.84 +0.9 84.78 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
0.00 +1.1 1.06 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64
0.00 +1.2 1.21 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to
2.07 +1.5 3.55 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.58 +1.5 3.07 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64
0.00 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap
0.00 +1.6 1.57 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +1.7 1.72 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
0.00 +2.0 2.01 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
5.39 +2.9 8.32 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
75.29 -1.9 73.37 perf-profile.children.cycles-pp.move_vma
37.06 -1.6 35.50 perf-profile.children.cycles-pp.do_vmi_align_munmap
24.98 -1.2 23.80 perf-profile.children.cycles-pp.copy_vma
19.99 -1.0 19.02 perf-profile.children.cycles-pp.handle_softirqs
19.97 -1.0 19.00 perf-profile.children.cycles-pp.rcu_core
19.95 -1.0 18.98 perf-profile.children.cycles-pp.rcu_do_batch
19.98 -0.9 19.06 perf-profile.children.cycles-pp.__split_vma
17.55 -0.8 16.76 perf-profile.children.cycles-pp.kmem_cache_free
10.56 ± 2% -0.8 9.79 ± 2% perf-profile.children.cycles-pp.run_ksoftirqd
10.57 ± 2% -0.8 9.80 ± 2% perf-profile.children.cycles-pp.smpboot_thread_fn
15.38 -0.8 14.62 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
10.62 ± 2% -0.8 9.85 ± 2% perf-profile.children.cycles-pp.kthread
10.62 ± 2% -0.8 9.86 ± 2% perf-profile.children.cycles-pp.ret_from_fork
10.62 ± 2% -0.8 9.86 ± 2% perf-profile.children.cycles-pp.ret_from_fork_asm
15.14 -0.7 14.44 perf-profile.children.cycles-pp.vma_merge
12.08 -0.5 11.55 perf-profile.children.cycles-pp.__slab_free
12.11 -0.5 11.62 perf-profile.children.cycles-pp.mas_wr_store_entry
10.86 -0.5 10.39 perf-profile.children.cycles-pp.vm_area_dup
11.89 -0.5 11.44 perf-profile.children.cycles-pp.mas_store_prealloc
8.49 -0.4 8.06 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
9.88 -0.4 9.49 perf-profile.children.cycles-pp.mas_wr_node_store
7.91 -0.3 7.58 perf-profile.children.cycles-pp.move_page_tables
6.06 -0.3 5.78 perf-profile.children.cycles-pp.vm_area_free_rcu_cb
8.28 -0.3 8.00 perf-profile.children.cycles-pp.unmap_region
6.69 -0.3 6.42 perf-profile.children.cycles-pp.vma_complete
5.06 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate
5.82 -0.2 5.57 perf-profile.children.cycles-pp.move_ptes
4.24 -0.2 4.01 perf-profile.children.cycles-pp.anon_vma_clone
3.50 -0.2 3.30 perf-profile.children.cycles-pp.down_write
2.44 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev
3.46 -0.2 3.28 perf-profile.children.cycles-pp.___slab_alloc
3.45 -0.2 3.27 perf-profile.children.cycles-pp.free_pgtables
2.54 -0.2 2.37 perf-profile.children.cycles-pp.rcu_cblist_dequeue
3.35 -0.2 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook
2.93 -0.2 2.78 perf-profile.children.cycles-pp.mas_alloc_nodes
2.28 ± 2% -0.2 2.12 ± 2% perf-profile.children.cycles-pp.vma_prepare
3.46 -0.1 3.32 perf-profile.children.cycles-pp.flush_tlb_mm_range
3.41 -0.1 3.27 ± 2% perf-profile.children.cycles-pp.mod_objcg_state
2.76 -0.1 2.63 perf-profile.children.cycles-pp.unlink_anon_vmas
3.41 -0.1 3.28 perf-profile.children.cycles-pp.mas_store_gfp
2.21 -0.1 2.09 perf-profile.children.cycles-pp.__cond_resched
2.04 -0.1 1.94 perf-profile.children.cycles-pp.allocate_slab
2.10 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common
2.51 -0.1 2.40 perf-profile.children.cycles-pp.flush_tlb_func
1.04 -0.1 0.94 perf-profile.children.cycles-pp.mas_prev
2.71 -0.1 2.61 perf-profile.children.cycles-pp.mtree_load
2.23 -0.1 2.14 perf-profile.children.cycles-pp.native_flush_tlb_one_user
0.22 ± 5% -0.1 0.13 ± 13% perf-profile.children.cycles-pp.vm_stat_account
0.95 -0.1 0.87 perf-profile.children.cycles-pp.mas_prev_setup
1.65 -0.1 1.57 perf-profile.children.cycles-pp.mas_wr_walk
1.84 -0.1 1.76 perf-profile.children.cycles-pp.up_write
1.27 -0.1 1.20 perf-profile.children.cycles-pp.mas_prev_slot
1.84 -0.1 1.77 perf-profile.children.cycles-pp.vma_link
1.39 -0.1 1.32 perf-profile.children.cycles-pp.shuffle_freelist
0.96 -0.1 0.90 ± 2% perf-profile.children.cycles-pp.rcu_all_qs
0.86 -0.1 0.80 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
1.70 -0.1 1.64 perf-profile.children.cycles-pp.__get_unmapped_area
0.34 ± 3% -0.1 0.29 ± 5% perf-profile.children.cycles-pp.security_vm_enough_memory_mm
0.60 -0.0 0.55 perf-profile.children.cycles-pp.entry_SYSCALL_64
0.92 -0.0 0.87 perf-profile.children.cycles-pp.percpu_counter_add_batch
1.07 -0.0 1.02 perf-profile.children.cycles-pp.vma_to_resize
1.59 -0.0 1.54 perf-profile.children.cycles-pp.mas_update_gap
0.44 ± 2% -0.0 0.40 ± 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.70 -0.0 0.66 perf-profile.children.cycles-pp.syscall_return_via_sysret
1.13 -0.0 1.09 perf-profile.children.cycles-pp.mt_find
0.20 ± 6% -0.0 0.17 ± 9% perf-profile.children.cycles-pp.cap_vm_enough_memory
0.99 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node
0.63 ± 2% -0.0 0.59 perf-profile.children.cycles-pp.security_mmap_addr
0.62 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials
1.17 -0.0 1.14 perf-profile.children.cycles-pp.clear_bhb_loop
0.46 -0.0 0.43 ± 2% perf-profile.children.cycles-pp.__alloc_pages_noprof
0.44 -0.0 0.41 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
0.90 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete
0.64 ± 2% -0.0 0.62 perf-profile.children.cycles-pp.get_old_pud
1.07 -0.0 1.05 perf-profile.children.cycles-pp.mas_leaf_max_gap
0.22 ± 3% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.__rmqueue_pcplist
0.55 -0.0 0.53 perf-profile.children.cycles-pp.refill_obj_stock
0.25 -0.0 0.23 ± 3% perf-profile.children.cycles-pp.rmqueue
0.48 -0.0 0.45 perf-profile.children.cycles-pp.mremap_userfaultfd_prep
0.33 -0.0 0.30 perf-profile.children.cycles-pp.free_unref_page
0.46 -0.0 0.44 perf-profile.children.cycles-pp.setup_object
0.21 ± 3% -0.0 0.19 ± 2% perf-profile.children.cycles-pp.rmqueue_bulk
0.31 ± 3% -0.0 0.29 perf-profile.children.cycles-pp.__vm_enough_memory
0.40 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
0.36 -0.0 0.35 perf-profile.children.cycles-pp.madvise_vma_behavior
0.54 -0.0 0.53 ± 2% perf-profile.children.cycles-pp.mas_wr_end_piv
0.46 -0.0 0.44 ± 2% perf-profile.children.cycles-pp.rcu_segcblist_enqueue
0.34 -0.0 0.32 ± 2% perf-profile.children.cycles-pp.mas_destroy
0.28 -0.0 0.26 ± 3% perf-profile.children.cycles-pp.mas_wr_store_setup
0.30 -0.0 0.28 perf-profile.children.cycles-pp.pte_offset_map_nolock
0.19 -0.0 0.18 ± 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders
0.08 ± 4% -0.0 0.07 perf-profile.children.cycles-pp.ksm_madvise
0.17 -0.0 0.16 perf-profile.children.cycles-pp.get_any_partial
0.08 -0.0 0.07 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.45 +0.0 0.47 perf-profile.children.cycles-pp._raw_spin_lock
1.10 +0.0 1.14 perf-profile.children.cycles-pp.zap_pte_range
0.78 +0.1 0.85 perf-profile.children.cycles-pp.__madvise
0.63 +0.1 0.70 perf-profile.children.cycles-pp.__x64_sys_madvise
0.62 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise
0.00 +0.1 0.09 ± 4% perf-profile.children.cycles-pp.can_modify_mm_madv
1.32 +0.1 1.46 perf-profile.children.cycles-pp.mas_next_slot
88.13 +0.7 88.83 perf-profile.children.cycles-pp.mremap
83.94 +0.9 84.88 perf-profile.children.cycles-pp.__do_sys_mremap
86.06 +0.9 87.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
85.56 +1.0 86.54 perf-profile.children.cycles-pp.do_syscall_64
40.49 +1.4 41.90 perf-profile.children.cycles-pp.do_vmi_munmap
2.10 +1.5 3.57 perf-profile.children.cycles-pp.do_munmap
3.62 +2.3 5.90 perf-profile.children.cycles-pp.mas_walk
5.44 +2.9 8.38 perf-profile.children.cycles-pp.mremap_to
5.30 +3.1 8.39 perf-profile.children.cycles-pp.mas_find
0.00 +5.4 5.40 perf-profile.children.cycles-pp.can_modify_mm
11.46 -0.5 10.96 perf-profile.self.cycles-pp.__slab_free
4.30 -0.2 4.08 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
2.51 -0.2 2.34 perf-profile.self.cycles-pp.rcu_cblist_dequeue
2.41 ± 2% -0.2 2.25 perf-profile.self.cycles-pp.down_write
2.21 -0.1 2.11 perf-profile.self.cycles-pp.native_flush_tlb_one_user
2.37 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load
1.60 -0.1 1.51 perf-profile.self.cycles-pp.__memcg_slab_free_hook
0.18 ± 3% -0.1 0.10 ± 15% perf-profile.self.cycles-pp.vm_stat_account
1.25 -0.1 1.18 perf-profile.self.cycles-pp.move_vma
1.76 -0.1 1.69 perf-profile.self.cycles-pp.mod_objcg_state
1.42 -0.1 1.35 ± 2% perf-profile.self.cycles-pp.__call_rcu_common
1.41 -0.1 1.34 perf-profile.self.cycles-pp.mas_wr_walk
1.52 -0.1 1.46 perf-profile.self.cycles-pp.up_write
1.02 -0.1 0.95 perf-profile.self.cycles-pp.mas_prev_slot
0.96 -0.1 0.90 ± 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb
1.50 -0.1 1.45 perf-profile.self.cycles-pp.kmem_cache_free
0.69 ± 3% -0.1 0.64 ± 2% perf-profile.self.cycles-pp.rcu_all_qs
1.14 ± 2% -0.1 1.09 perf-profile.self.cycles-pp.shuffle_freelist
1.10 -0.1 1.05 perf-profile.self.cycles-pp.__cond_resched
1.40 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap
0.99 -0.0 0.94 perf-profile.self.cycles-pp.mas_preallocate
0.88 -0.0 0.83 perf-profile.self.cycles-pp.___slab_alloc
0.55 -0.0 0.50 perf-profile.self.cycles-pp.mremap_to
0.98 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes
0.78 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch
0.21 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64
0.44 ± 2% -0.0 0.40 ± 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.92 -0.0 0.89 perf-profile.self.cycles-pp.mas_store_gfp
0.86 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node
0.50 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
1.15 -0.0 1.12 perf-profile.self.cycles-pp.clear_bhb_loop
1.14 -0.0 1.11 perf-profile.self.cycles-pp.vma_merge
0.66 -0.0 0.63 perf-profile.self.cycles-pp.__split_vma
0.16 ± 6% -0.0 0.13 ± 7% perf-profile.self.cycles-pp.cap_vm_enough_memory
0.82 -0.0 0.79 perf-profile.self.cycles-pp.mas_wr_store_entry
0.54 ± 2% -0.0 0.52 perf-profile.self.cycles-pp.get_old_pud
0.43 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap
0.51 ± 2% -0.0 0.48 ± 2% perf-profile.self.cycles-pp.security_mmap_addr
0.50 -0.0 0.48 perf-profile.self.cycles-pp.refill_obj_stock
0.24 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev
0.71 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range
0.48 -0.0 0.45 perf-profile.self.cycles-pp.find_vma_prev
0.42 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.66 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc
0.31 -0.0 0.29 perf-profile.self.cycles-pp.mas_prev_setup
0.43 -0.0 0.41 perf-profile.self.cycles-pp.mas_wr_end_piv
0.78 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete
0.28 -0.0 0.26 ± 2% perf-profile.self.cycles-pp.mas_put_in_tree
0.42 -0.0 0.40 perf-profile.self.cycles-pp.mremap_userfaultfd_prep
0.28 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables
0.39 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.30 ± 2% -0.0 0.28 perf-profile.self.cycles-pp.zap_pmd_range
0.32 -0.0 0.31 perf-profile.self.cycles-pp.unmap_vmas
0.21 -0.0 0.20 perf-profile.self.cycles-pp.__get_unmapped_area
0.18 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.lru_add_drain_cpu
0.06 -0.0 0.05 perf-profile.self.cycles-pp.ksm_madvise
0.45 +0.0 0.46 perf-profile.self.cycles-pp.do_vmi_munmap
0.37 +0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock
1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot
1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find
0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm
3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk
***************************************************************************************************
lkp-spr-r02: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pkey/stress-ng/60s
commit:
ff388fe5c4 ("mseal: wire up mseal syscall")
8be7258aad ("mseal: add mseal syscall")
ff388fe5c481d39c 8be7258aad44b5e25977a98db13
---------------- ---------------------------
%stddev %change %stddev
\ | \
10539 -2.5% 10273 vmstat.system.cs
0.28 ± 5% -20.1% 0.22 ± 7% sched_debug.cfs_rq:/.h_nr_running.stddev
1419 ± 7% -15.3% 1202 ± 6% sched_debug.cfs_rq:/.util_avg.max
0.28 ± 6% -18.4% 0.23 ± 8% sched_debug.cpu.nr_running.stddev
8.736e+08 -3.6% 8.423e+08 stress-ng.pkey.ops
14560560 -3.6% 14038795 stress-ng.pkey.ops_per_sec
770.39 ± 4% -5.0% 732.04 stress-ng.time.user_time
244657 ± 3% +5.8% 258782 ± 3% proc-vmstat.nr_slab_unreclaimable
73133541 -2.1% 71588873 proc-vmstat.numa_hit
72873579 -2.1% 71357274 proc-vmstat.numa_local
1.842e+08 -2.5% 1.796e+08 proc-vmstat.pgalloc_normal
1.767e+08 -2.8% 1.717e+08 proc-vmstat.pgfree
1345346 ± 40% -73.1% 362064 ±124% numa-vmstat.node0.nr_inactive_anon
1345340 ± 40% -73.1% 362062 ±124% numa-vmstat.node0.nr_zone_inactive_anon
2420830 ± 14% +35.1% 3270248 ± 16% numa-vmstat.node1.nr_file_pages
2067871 ± 13% +51.5% 3132982 ± 17% numa-vmstat.node1.nr_inactive_anon
191406 ± 17% +33.6% 255808 ± 14% numa-vmstat.node1.nr_mapped
2452 ± 61% +104.4% 5012 ± 35% numa-vmstat.node1.nr_page_table_pages
2067853 ± 13% +51.5% 3132966 ± 17% numa-vmstat.node1.nr_zone_inactive_anon
5379238 ± 40% -73.0% 1453605 ±123% numa-meminfo.node0.Inactive
5379166 ± 40% -73.0% 1453462 ±123% numa-meminfo.node0.Inactive(anon)
8741077 ± 22% -36.7% 5531290 ± 28% numa-meminfo.node0.MemUsed
9651902 ± 13% +35.8% 13105318 ± 16% numa-meminfo.node1.FilePages
8239855 ± 13% +52.4% 12556929 ± 17% numa-meminfo.node1.Inactive
8239712 ± 13% +52.4% 12556853 ± 17% numa-meminfo.node1.Inactive(anon)
761944 ± 18% +34.6% 1025906 ± 14% numa-meminfo.node1.Mapped
11679628 ± 11% +31.2% 15322841 ± 14% numa-meminfo.node1.MemUsed
9874 ± 62% +104.6% 20200 ± 36% numa-meminfo.node1.PageTables
0.74 -4.2% 0.71 perf-stat.i.MPKI
1.245e+11 +2.3% 1.274e+11 perf-stat.i.branch-instructions
0.37 -0.0 0.35 perf-stat.i.branch-miss-rate%
4.359e+08 -2.1% 4.265e+08 perf-stat.i.branch-misses
4.672e+08 -2.6% 4.548e+08 perf-stat.i.cache-misses
7.276e+08 -2.7% 7.082e+08 perf-stat.i.cache-references
1.00 -1.6% 0.98 perf-stat.i.cpi
1364 +2.9% 1404 perf-stat.i.cycles-between-cache-misses
6.392e+11 +1.7% 6.499e+11 perf-stat.i.instructions
1.00 +1.6% 1.02 perf-stat.i.ipc
0.74 -4.3% 0.71 perf-stat.overall.MPKI
0.35 -0.0 0.33 perf-stat.overall.branch-miss-rate%
1.00 -1.6% 0.99 perf-stat.overall.cpi
1356 +2.9% 1395 perf-stat.overall.cycles-between-cache-misses
1.00 +1.6% 1.01 perf-stat.overall.ipc
1.209e+11 +1.9% 1.232e+11 perf-stat.ps.branch-instructions
4.188e+08 -2.6% 4.077e+08 perf-stat.ps.branch-misses
4.585e+08 -3.1% 4.441e+08 perf-stat.ps.cache-misses
7.124e+08 -3.1% 6.901e+08 perf-stat.ps.cache-references
10321 -2.6% 10053 perf-stat.ps.context-switches
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-04 8:59 [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression kernel test robot
@ 2024-08-04 20:32 ` Linus Torvalds
2024-08-05 13:33 ` Pedro Falcato
2024-08-05 17:54 ` Jeff Xu
2024-08-05 13:56 ` Jeff Xu
2024-08-05 16:58 ` Jeff Xu
2 siblings, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2024-08-04 20:32 UTC (permalink / raw)
To: kernel test robot
Cc: Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook,
Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman,
Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet,
Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin
On Sun, 4 Aug 2024 at 01:59, kernel test robot <oliver.sang@intel.com> wrote:
>
> kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on
> commit 8be7258aad44 ("mseal: add mseal syscall")
Ok, it's basically just the vma walk in can_modify_mm():
> 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot
> 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find
> 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm
> 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk
and it looks like it's two different pathways. We have __do_sys_mremap ->
mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the
destination mapping, but we also have mremap_to() calling
can_modify_mm() directly for the source mapping.
And then do_vmi_munmap() will do its *own* vma_find() after having
done arch_unmap().
And do_munmap() will obviously do its own vma lookup as part of
calling vma_to_resize().
So it looks like a large portion of this regression is because the
mseal addition just ends up walking the vma list way too much.
Linus
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-04 20:32 ` Linus Torvalds
@ 2024-08-05 13:33 ` Pedro Falcato
2024-08-05 18:10 ` Jeff Xu
2024-08-05 17:54 ` Jeff Xu
1 sibling, 1 reply; 29+ messages in thread
From: Pedro Falcato @ 2024-08-05 13:33 UTC (permalink / raw)
To: Linus Torvalds
Cc: kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel,
Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jeff Xu,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
On Sun, Aug 4, 2024 at 9:33 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, 4 Aug 2024 at 01:59, kernel test robot <oliver.sang@intel.com> wrote:
> >
> > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on
> > commit 8be7258aad44 ("mseal: add mseal syscall")
>
> Ok, it's basically just the vma walk in can_modify_mm():
>
> > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot
> > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find
> > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm
> > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk
>
> and looks like it's two different pathways. We have __do_sys_mremap ->
> mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the
> destination mapping, but we also have mremap_to() calling
> can_modify_mm() directly for the source mapping.
>
> And then do_vmi_munmap() will do its *own* vma_find() after having
> done arch_unmap().
>
> And do_munmap() will obviously do its own vma lookup as part of
> calling vma_to_resize().
>
> So it looks like a large portion of this regression is because the
> mseal addition just ends up walking the vma list way too much.
Can we roll back the upfront-check "funny business" and just call
can_modify_vma directly in the relevant places? I still don't believe in
the partial mprotect/munmap "security risks" that were stated in the
mseal thread (and these operations can already fail for many reasons
other than mseal) :)
I don't mind taking a look myself, just want to make sure I'm not
stepping on anyone's toes here.
--
Pedro
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-04 8:59 [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression kernel test robot
2024-08-04 20:32 ` Linus Torvalds
@ 2024-08-05 13:56 ` Jeff Xu
2024-08-05 16:58 ` Jeff Xu
2 siblings, 0 replies; 29+ messages in thread
From: Jeff Xu @ 2024-08-05 13:56 UTC (permalink / raw)
To: kernel test robot
Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook,
Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman,
Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet,
Jorge Lucangeli Obes, Linus Torvalds, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
On Sun, Aug 4, 2024 at 1:59 AM kernel test robot <oliver.sang@intel.com> wrote:
>
>
>
> Hello,
>
> kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on:
>
Looking.
I'm setting up the environment so I can repro.
>
> commit: 8be7258aad44b5e25977a98db136f677fa6f4370 ("mseal: add mseal syscall")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> testcase: stress-ng
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
>
> nr_threads: 100%
> testtime: 60s
> test: pagemove
> cpufreq_governor: performance
>
>
> In addition to that, the commit also has significant impact on the following tests:
>
> +------------------+---------------------------------------------------------------------------------------------+
> | testcase: change | stress-ng: stress-ng.pkey.ops_per_sec -3.6% regression |
> | test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory |
> | test parameters | cpufreq_governor=performance |
> | | nr_threads=100% |
> | | test=pkey |
> | | testtime=60s |
> +------------------+---------------------------------------------------------------------------------------------+
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202408041602.caa0372-oliver.sang@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240804/202408041602.caa0372-oliver.sang@intel.com
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
>
> commit:
> ff388fe5c4 ("mseal: wire up mseal syscall")
> 8be7258aad ("mseal: add mseal syscall")
>
> ff388fe5c481d39c 8be7258aad44b5e25977a98db13
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 41625945 -4.3% 39842322 proc-vmstat.numa_hit
> 41559175 -4.3% 39774160 proc-vmstat.numa_local
> 77484314 -4.4% 74105555 proc-vmstat.pgalloc_normal
> 77205752 -4.4% 73826672 proc-vmstat.pgfree
> 18361466 -4.2% 17596652 stress-ng.pagemove.ops
> 306014 -4.2% 293262 stress-ng.pagemove.ops_per_sec
> 205312 -4.4% 196176 stress-ng.pagemove.page_remaps_per_sec
> 4961 +1.0% 5013 stress-ng.time.percent_of_cpu_this_job_got
> 2917 +1.2% 2952 stress-ng.time.system_time
> 1.07 -6.6% 1.00 perf-stat.i.MPKI
> 3.354e+10 +3.5% 3.473e+10 perf-stat.i.branch-instructions
> 1.795e+08 -4.2% 1.719e+08 perf-stat.i.cache-misses
> 2.376e+08 -4.1% 2.279e+08 perf-stat.i.cache-references
> 1.13 -3.0% 1.10 perf-stat.i.cpi
> 1077 +4.3% 1124 perf-stat.i.cycles-between-cache-misses
> 1.717e+11 +2.7% 1.762e+11 perf-stat.i.instructions
> 0.88 +3.1% 0.91 perf-stat.i.ipc
> 1.05 -6.8% 0.97 perf-stat.overall.MPKI
> 0.25 ± 2% -0.0 0.24 perf-stat.overall.branch-miss-rate%
> 1.13 -3.0% 1.10 perf-stat.overall.cpi
> 1084 +4.0% 1127 perf-stat.overall.cycles-between-cache-misses
> 0.88 +3.1% 0.91 perf-stat.overall.ipc
> 3.298e+10 +3.5% 3.415e+10 perf-stat.ps.branch-instructions
> 1.764e+08 -4.3% 1.689e+08 perf-stat.ps.cache-misses
> 2.336e+08 -4.1% 2.24e+08 perf-stat.ps.cache-references
> 194.57 -2.4% 189.96 ± 2% perf-stat.ps.cpu-migrations
> 1.688e+11 +2.7% 1.733e+11 perf-stat.ps.instructions
> 1.036e+13 +3.0% 1.068e+13 perf-stat.total.instructions
> 75.12 -1.9 73.22 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 36.84 -1.6 35.29 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> 24.90 -1.2 23.72 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 19.89 -0.9 18.98 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 10.56 ± 2% -0.8 9.78 ± 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread
> 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
> 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 10.57 ± 2% -0.8 9.80 ± 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 10.52 ± 2% -0.8 9.75 ± 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn
> 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
> 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
> 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
> 14.75 -0.7 14.07 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> 1.50 -0.6 0.94 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> 5.88 ± 2% -0.4 5.47 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> 7.80 -0.3 7.47 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 4.55 ± 2% -0.3 4.24 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
> 6.76 -0.3 6.45 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 6.15 -0.3 5.86 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> 8.22 -0.3 7.93 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 6.12 -0.3 5.87 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 5.74 -0.2 5.50 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
> 3.16 ± 2% -0.2 2.94 perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> 5.50 -0.2 5.28 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> 1.36 -0.2 1.14 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
> 5.15 -0.2 4.94 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap
> 5.51 -0.2 5.31 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 5.16 -0.2 4.97 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma
> 2.24 -0.2 2.05 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> 2.60 ± 2% -0.2 2.42 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs
> 4.67 -0.2 4.49 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma
> 3.41 -0.2 3.23 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 3.00 -0.2 2.83 ± 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 0.96 -0.2 0.80 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to
> 4.04 -0.2 3.88 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> 3.20 ± 2% -0.2 3.04 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
> 3.53 -0.1 3.38 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap
> 3.40 -0.1 3.26 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap
> 2.20 ± 2% -0.1 2.06 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> 1.84 ± 3% -0.1 1.71 ± 3% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap
> 1.78 ± 2% -0.1 1.65 ± 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 2.69 -0.1 2.56 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap
> 1.78 ± 2% -0.1 1.66 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
> 1.36 ± 2% -0.1 1.23 ± 2% perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> 0.95 -0.1 0.83 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap
> 3.29 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 2.08 -0.1 1.96 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 1.43 ± 3% -0.1 1.32 ± 3% perf-profile.calltrace.cycles-pp.down_write.vma_prepare.vma_merge.copy_vma.move_vma
> 2.21 -0.1 2.10 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 2.47 -0.1 2.36 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma
> 2.21 -0.1 2.12 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables
> 1.41 -0.1 1.32 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> 1.26 -0.1 1.18 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap
> 1.82 -0.1 1.75 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> 0.71 -0.1 0.63 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 1.29 -0.1 1.22 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 0.61 -0.1 0.54 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma
> 1.36 -0.1 1.29 perf-profile.calltrace.cycles-pp.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap
> 1.40 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma
> 0.70 -0.1 0.64 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
> 1.23 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma
> 1.66 -0.1 1.60 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.16 -0.1 1.10 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> 0.96 -0.1 0.90 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region
> 1.14 -0.1 1.08 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
> 0.79 -0.1 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma
> 1.04 -0.1 1.00 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.58 -0.0 0.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap
> 0.61 -0.0 0.56 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
> 0.56 -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
> 0.57 -0.0 0.53 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
> 0.78 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge
> 0.88 -0.0 0.84 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64
> 0.70 -0.0 0.66 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap
> 0.97 -0.0 0.93 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 1.11 -0.0 1.08 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap
> 0.75 -0.0 0.72 perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
> 0.74 -0.0 0.71 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap
> 0.60 ± 2% -0.0 0.57 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64
> 0.67 ± 2% -0.0 0.64 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma
> 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 0.63 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 0.99 -0.0 0.96 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 0.62 ± 2% -0.0 0.59 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
> 0.87 -0.0 0.84 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 0.78 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap
> 0.64 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap
> 0.90 -0.0 0.87 perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 0.54 -0.0 0.52 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.move_ptes.move_page_tables.move_vma.__do_sys_mremap
> 1.04 +0.0 1.08 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
> 0.76 +0.1 0.83 perf-profile.calltrace.cycles-pp.__madvise
> 0.63 +0.1 0.70 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> 0.62 +0.1 0.70 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
> 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> 87.74 +0.7 88.45 perf-profile.calltrace.cycles-pp.mremap
> 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap
> 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap
> 84.88 +0.9 85.77 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap
> 84.73 +0.9 85.62 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 0.00 +0.9 0.92 ą 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma
> 83.84 +0.9 84.78 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 0.00 +1.1 1.06 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64
> 0.00 +1.2 1.21 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to
> 2.07 +1.5 3.55 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.58 +1.5 3.07 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64
> 0.00 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap
> 0.00 +1.6 1.57 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +1.7 1.72 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
> 0.00 +2.0 2.01 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> 5.39 +2.9 8.32 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 75.29 -1.9 73.37 perf-profile.children.cycles-pp.move_vma
> 37.06 -1.6 35.50 perf-profile.children.cycles-pp.do_vmi_align_munmap
> 24.98 -1.2 23.80 perf-profile.children.cycles-pp.copy_vma
> 19.99 -1.0 19.02 perf-profile.children.cycles-pp.handle_softirqs
> 19.97 -1.0 19.00 perf-profile.children.cycles-pp.rcu_core
> 19.95 -1.0 18.98 perf-profile.children.cycles-pp.rcu_do_batch
> 19.98 -0.9 19.06 perf-profile.children.cycles-pp.__split_vma
> 17.55 -0.8 16.76 perf-profile.children.cycles-pp.kmem_cache_free
> 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.children.cycles-pp.run_ksoftirqd
> 10.57 ± 2% -0.8 9.80 ± 2% perf-profile.children.cycles-pp.smpboot_thread_fn
> 15.38 -0.8 14.62 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
> 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.children.cycles-pp.kthread
> 10.62 ± 2% -0.8 9.86 ± 2% perf-profile.children.cycles-pp.ret_from_fork
> 10.62 ± 2% -0.8 9.86 ± 2% perf-profile.children.cycles-pp.ret_from_fork_asm
> 15.14 -0.7 14.44 perf-profile.children.cycles-pp.vma_merge
> 12.08 -0.5 11.55 perf-profile.children.cycles-pp.__slab_free
> 12.11 -0.5 11.62 perf-profile.children.cycles-pp.mas_wr_store_entry
> 10.86 -0.5 10.39 perf-profile.children.cycles-pp.vm_area_dup
> 11.89 -0.5 11.44 perf-profile.children.cycles-pp.mas_store_prealloc
> 8.49 -0.4 8.06 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
> 9.88 -0.4 9.49 perf-profile.children.cycles-pp.mas_wr_node_store
> 7.91 -0.3 7.58 perf-profile.children.cycles-pp.move_page_tables
> 6.06 -0.3 5.78 perf-profile.children.cycles-pp.vm_area_free_rcu_cb
> 8.28 -0.3 8.00 perf-profile.children.cycles-pp.unmap_region
> 6.69 -0.3 6.42 perf-profile.children.cycles-pp.vma_complete
> 5.06 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate
> 5.82 -0.2 5.57 perf-profile.children.cycles-pp.move_ptes
> 4.24 -0.2 4.01 perf-profile.children.cycles-pp.anon_vma_clone
> 3.50 -0.2 3.30 perf-profile.children.cycles-pp.down_write
> 2.44 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev
> 3.46 -0.2 3.28 perf-profile.children.cycles-pp.___slab_alloc
> 3.45 -0.2 3.27 perf-profile.children.cycles-pp.free_pgtables
> 2.54 -0.2 2.37 perf-profile.children.cycles-pp.rcu_cblist_dequeue
> 3.35 -0.2 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook
> 2.93 -0.2 2.78 perf-profile.children.cycles-pp.mas_alloc_nodes
> 2.28 ± 2% -0.2 2.12 ± 2% perf-profile.children.cycles-pp.vma_prepare
> 3.46 -0.1 3.32 perf-profile.children.cycles-pp.flush_tlb_mm_range
> 3.41 -0.1 3.27 ± 2% perf-profile.children.cycles-pp.mod_objcg_state
> 2.76 -0.1 2.63 perf-profile.children.cycles-pp.unlink_anon_vmas
> 3.41 -0.1 3.28 perf-profile.children.cycles-pp.mas_store_gfp
> 2.21 -0.1 2.09 perf-profile.children.cycles-pp.__cond_resched
> 2.04 -0.1 1.94 perf-profile.children.cycles-pp.allocate_slab
> 2.10 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common
> 2.51 -0.1 2.40 perf-profile.children.cycles-pp.flush_tlb_func
> 1.04 -0.1 0.94 perf-profile.children.cycles-pp.mas_prev
> 2.71 -0.1 2.61 perf-profile.children.cycles-pp.mtree_load
> 2.23 -0.1 2.14 perf-profile.children.cycles-pp.native_flush_tlb_one_user
> 0.22 ± 5% -0.1 0.13 ± 13% perf-profile.children.cycles-pp.vm_stat_account
> 0.95 -0.1 0.87 perf-profile.children.cycles-pp.mas_prev_setup
> 1.65 -0.1 1.57 perf-profile.children.cycles-pp.mas_wr_walk
> 1.84 -0.1 1.76 perf-profile.children.cycles-pp.up_write
> 1.27 -0.1 1.20 perf-profile.children.cycles-pp.mas_prev_slot
> 1.84 -0.1 1.77 perf-profile.children.cycles-pp.vma_link
> 1.39 -0.1 1.32 perf-profile.children.cycles-pp.shuffle_freelist
> 0.96 -0.1 0.90 ± 2% perf-profile.children.cycles-pp.rcu_all_qs
> 0.86 -0.1 0.80 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> 1.70 -0.1 1.64 perf-profile.children.cycles-pp.__get_unmapped_area
> 0.34 ± 3% -0.1 0.29 ± 5% perf-profile.children.cycles-pp.security_vm_enough_memory_mm
> 0.60 -0.0 0.55 perf-profile.children.cycles-pp.entry_SYSCALL_64
> 0.92 -0.0 0.87 perf-profile.children.cycles-pp.percpu_counter_add_batch
> 1.07 -0.0 1.02 perf-profile.children.cycles-pp.vma_to_resize
> 1.59 -0.0 1.54 perf-profile.children.cycles-pp.mas_update_gap
> 0.44 ± 2% -0.0 0.40 ± 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> 0.70 -0.0 0.66 perf-profile.children.cycles-pp.syscall_return_via_sysret
> 1.13 -0.0 1.09 perf-profile.children.cycles-pp.mt_find
> 0.20 ± 6% -0.0 0.17 ± 9% perf-profile.children.cycles-pp.cap_vm_enough_memory
> 0.99 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node
> 0.63 ± 2% -0.0 0.59 perf-profile.children.cycles-pp.security_mmap_addr
> 0.62 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials
> 1.17 -0.0 1.14 perf-profile.children.cycles-pp.clear_bhb_loop
> 0.46 -0.0 0.43 ± 2% perf-profile.children.cycles-pp.__alloc_pages_noprof
> 0.44 -0.0 0.41 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
> 0.90 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete
> 0.64 ± 2% -0.0 0.62 perf-profile.children.cycles-pp.get_old_pud
> 1.07 -0.0 1.05 perf-profile.children.cycles-pp.mas_leaf_max_gap
> 0.22 ± 3% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.__rmqueue_pcplist
> 0.55 -0.0 0.53 perf-profile.children.cycles-pp.refill_obj_stock
> 0.25 -0.0 0.23 ± 3% perf-profile.children.cycles-pp.rmqueue
> 0.48 -0.0 0.45 perf-profile.children.cycles-pp.mremap_userfaultfd_prep
> 0.33 -0.0 0.30 perf-profile.children.cycles-pp.free_unref_page
> 0.46 -0.0 0.44 perf-profile.children.cycles-pp.setup_object
> 0.21 ± 3% -0.0 0.19 ± 2% perf-profile.children.cycles-pp.rmqueue_bulk
> 0.31 ± 3% -0.0 0.29 perf-profile.children.cycles-pp.__vm_enough_memory
> 0.40 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> 0.36 -0.0 0.35 perf-profile.children.cycles-pp.madvise_vma_behavior
> 0.54 -0.0 0.53 ± 2% perf-profile.children.cycles-pp.mas_wr_end_piv
> 0.46 -0.0 0.44 ± 2% perf-profile.children.cycles-pp.rcu_segcblist_enqueue
> 0.34 -0.0 0.32 ± 2% perf-profile.children.cycles-pp.mas_destroy
> 0.28 -0.0 0.26 ± 3% perf-profile.children.cycles-pp.mas_wr_store_setup
> 0.30 -0.0 0.28 perf-profile.children.cycles-pp.pte_offset_map_nolock
> 0.19 -0.0 0.18 ± 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders
> 0.08 ± 4% -0.0 0.07 perf-profile.children.cycles-pp.ksm_madvise
> 0.17 -0.0 0.16 perf-profile.children.cycles-pp.get_any_partial
> 0.08 -0.0 0.07 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
> 0.45 +0.0 0.47 perf-profile.children.cycles-pp._raw_spin_lock
> 1.10 +0.0 1.14 perf-profile.children.cycles-pp.zap_pte_range
> 0.78 +0.1 0.85 perf-profile.children.cycles-pp.__madvise
> 0.63 +0.1 0.70 perf-profile.children.cycles-pp.__x64_sys_madvise
> 0.62 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise
> 0.00 +0.1 0.09 ± 4% perf-profile.children.cycles-pp.can_modify_mm_madv
> 1.32 +0.1 1.46 perf-profile.children.cycles-pp.mas_next_slot
> 88.13 +0.7 88.83 perf-profile.children.cycles-pp.mremap
> 83.94 +0.9 84.88 perf-profile.children.cycles-pp.__do_sys_mremap
> 86.06 +0.9 87.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 85.56 +1.0 86.54 perf-profile.children.cycles-pp.do_syscall_64
> 40.49 +1.4 41.90 perf-profile.children.cycles-pp.do_vmi_munmap
> 2.10 +1.5 3.57 perf-profile.children.cycles-pp.do_munmap
> 3.62 +2.3 5.90 perf-profile.children.cycles-pp.mas_walk
> 5.44 +2.9 8.38 perf-profile.children.cycles-pp.mremap_to
> 5.30 +3.1 8.39 perf-profile.children.cycles-pp.mas_find
> 0.00 +5.4 5.40 perf-profile.children.cycles-pp.can_modify_mm
> 11.46 -0.5 10.96 perf-profile.self.cycles-pp.__slab_free
> 4.30 -0.2 4.08 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
> 2.51 -0.2 2.34 perf-profile.self.cycles-pp.rcu_cblist_dequeue
> 2.41 ± 2% -0.2 2.25 perf-profile.self.cycles-pp.down_write
> 2.21 -0.1 2.11 perf-profile.self.cycles-pp.native_flush_tlb_one_user
> 2.37 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load
> 1.60 -0.1 1.51 perf-profile.self.cycles-pp.__memcg_slab_free_hook
> 0.18 ± 3% -0.1 0.10 ± 15% perf-profile.self.cycles-pp.vm_stat_account
> 1.25 -0.1 1.18 perf-profile.self.cycles-pp.move_vma
> 1.76 -0.1 1.69 perf-profile.self.cycles-pp.mod_objcg_state
> 1.42 -0.1 1.35 ± 2% perf-profile.self.cycles-pp.__call_rcu_common
> 1.41 -0.1 1.34 perf-profile.self.cycles-pp.mas_wr_walk
> 1.52 -0.1 1.46 perf-profile.self.cycles-pp.up_write
> 1.02 -0.1 0.95 perf-profile.self.cycles-pp.mas_prev_slot
> 0.96 -0.1 0.90 ± 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb
> 1.50 -0.1 1.45 perf-profile.self.cycles-pp.kmem_cache_free
> 0.69 ± 3% -0.1 0.64 ± 2% perf-profile.self.cycles-pp.rcu_all_qs
> 1.14 ± 2% -0.1 1.09 perf-profile.self.cycles-pp.shuffle_freelist
> 1.10 -0.1 1.05 perf-profile.self.cycles-pp.__cond_resched
> 1.40 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap
> 0.99 -0.0 0.94 perf-profile.self.cycles-pp.mas_preallocate
> 0.88 -0.0 0.83 perf-profile.self.cycles-pp.___slab_alloc
> 0.55 -0.0 0.50 perf-profile.self.cycles-pp.mremap_to
> 0.98 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes
> 0.78 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch
> 0.21 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64
> 0.44 ± 2% -0.0 0.40 ± 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> 0.92 -0.0 0.89 perf-profile.self.cycles-pp.mas_store_gfp
> 0.86 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node
> 0.50 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> 1.15 -0.0 1.12 perf-profile.self.cycles-pp.clear_bhb_loop
> 1.14 -0.0 1.11 perf-profile.self.cycles-pp.vma_merge
> 0.66 -0.0 0.63 perf-profile.self.cycles-pp.__split_vma
> 0.16 ą 6% -0.0 0.13 ą 7% perf-profile.self.cycles-pp.cap_vm_enough_memory
> 0.82 -0.0 0.79 perf-profile.self.cycles-pp.mas_wr_store_entry
> 0.54 ą 2% -0.0 0.52 perf-profile.self.cycles-pp.get_old_pud
> 0.43 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap
> 0.51 ą 2% -0.0 0.48 ą 2% perf-profile.self.cycles-pp.security_mmap_addr
> 0.50 -0.0 0.48 perf-profile.self.cycles-pp.refill_obj_stock
> 0.24 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev
> 0.71 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range
> 0.48 -0.0 0.45 perf-profile.self.cycles-pp.find_vma_prev
> 0.42 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> 0.66 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc
> 0.31 -0.0 0.29 perf-profile.self.cycles-pp.mas_prev_setup
> 0.43 -0.0 0.41 perf-profile.self.cycles-pp.mas_wr_end_piv
> 0.78 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete
> 0.28 -0.0 0.26 ± 2% perf-profile.self.cycles-pp.mas_put_in_tree
> 0.42 -0.0 0.40 perf-profile.self.cycles-pp.mremap_userfaultfd_prep
> 0.28 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables
> 0.39 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> 0.30 ± 2% -0.0 0.28 perf-profile.self.cycles-pp.zap_pmd_range
> 0.32 -0.0 0.31 perf-profile.self.cycles-pp.unmap_vmas
> 0.21 -0.0 0.20 perf-profile.self.cycles-pp.__get_unmapped_area
> 0.18 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.lru_add_drain_cpu
> 0.06 -0.0 0.05 perf-profile.self.cycles-pp.ksm_madvise
> 0.45 +0.0 0.46 perf-profile.self.cycles-pp.do_vmi_munmap
> 0.37 +0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock
> 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot
> 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find
> 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm
> 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk
>
>
> ***************************************************************************************************
> lkp-spr-r02: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pkey/stress-ng/60s
>
> commit:
> ff388fe5c4 ("mseal: wire up mseal syscall")
> 8be7258aad ("mseal: add mseal syscall")
>
> ff388fe5c481d39c 8be7258aad44b5e25977a98db13
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 10539 -2.5% 10273 vmstat.system.cs
> 0.28 ± 5% -20.1% 0.22 ± 7% sched_debug.cfs_rq:/.h_nr_running.stddev
> 1419 ± 7% -15.3% 1202 ± 6% sched_debug.cfs_rq:/.util_avg.max
> 0.28 ± 6% -18.4% 0.23 ± 8% sched_debug.cpu.nr_running.stddev
> 8.736e+08 -3.6% 8.423e+08 stress-ng.pkey.ops
> 14560560 -3.6% 14038795 stress-ng.pkey.ops_per_sec
> 770.39 ± 4% -5.0% 732.04 stress-ng.time.user_time
> 244657 ± 3% +5.8% 258782 ± 3% proc-vmstat.nr_slab_unreclaimable
> 73133541 -2.1% 71588873 proc-vmstat.numa_hit
> 72873579 -2.1% 71357274 proc-vmstat.numa_local
> 1.842e+08 -2.5% 1.796e+08 proc-vmstat.pgalloc_normal
> 1.767e+08 -2.8% 1.717e+08 proc-vmstat.pgfree
> 1345346 ± 40% -73.1% 362064 ±124% numa-vmstat.node0.nr_inactive_anon
> 1345340 ± 40% -73.1% 362062 ±124% numa-vmstat.node0.nr_zone_inactive_anon
> 2420830 ± 14% +35.1% 3270248 ± 16% numa-vmstat.node1.nr_file_pages
> 2067871 ± 13% +51.5% 3132982 ± 17% numa-vmstat.node1.nr_inactive_anon
> 191406 ± 17% +33.6% 255808 ± 14% numa-vmstat.node1.nr_mapped
> 2452 ± 61% +104.4% 5012 ± 35% numa-vmstat.node1.nr_page_table_pages
> 2067853 ± 13% +51.5% 3132966 ± 17% numa-vmstat.node1.nr_zone_inactive_anon
> 5379238 ± 40% -73.0% 1453605 ±123% numa-meminfo.node0.Inactive
> 5379166 ± 40% -73.0% 1453462 ±123% numa-meminfo.node0.Inactive(anon)
> 8741077 ± 22% -36.7% 5531290 ± 28% numa-meminfo.node0.MemUsed
> 9651902 ± 13% +35.8% 13105318 ± 16% numa-meminfo.node1.FilePages
> 8239855 ± 13% +52.4% 12556929 ± 17% numa-meminfo.node1.Inactive
> 8239712 ± 13% +52.4% 12556853 ± 17% numa-meminfo.node1.Inactive(anon)
> 761944 ± 18% +34.6% 1025906 ± 14% numa-meminfo.node1.Mapped
> 11679628 ± 11% +31.2% 15322841 ± 14% numa-meminfo.node1.MemUsed
> 9874 ± 62% +104.6% 20200 ± 36% numa-meminfo.node1.PageTables
> 0.74 -4.2% 0.71 perf-stat.i.MPKI
> 1.245e+11 +2.3% 1.274e+11 perf-stat.i.branch-instructions
> 0.37 -0.0 0.35 perf-stat.i.branch-miss-rate%
> 4.359e+08 -2.1% 4.265e+08 perf-stat.i.branch-misses
> 4.672e+08 -2.6% 4.548e+08 perf-stat.i.cache-misses
> 7.276e+08 -2.7% 7.082e+08 perf-stat.i.cache-references
> 1.00 -1.6% 0.98 perf-stat.i.cpi
> 1364 +2.9% 1404 perf-stat.i.cycles-between-cache-misses
> 6.392e+11 +1.7% 6.499e+11 perf-stat.i.instructions
> 1.00 +1.6% 1.02 perf-stat.i.ipc
> 0.74 -4.3% 0.71 perf-stat.overall.MPKI
> 0.35 -0.0 0.33 perf-stat.overall.branch-miss-rate%
> 1.00 -1.6% 0.99 perf-stat.overall.cpi
> 1356 +2.9% 1395 perf-stat.overall.cycles-between-cache-misses
> 1.00 +1.6% 1.01 perf-stat.overall.ipc
> 1.209e+11 +1.9% 1.232e+11 perf-stat.ps.branch-instructions
> 4.188e+08 -2.6% 4.077e+08 perf-stat.ps.branch-misses
> 4.585e+08 -3.1% 4.441e+08 perf-stat.ps.cache-misses
> 7.124e+08 -3.1% 6.901e+08 perf-stat.ps.cache-references
> 10321 -2.6% 10053 perf-stat.ps.context-switches
>
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-04 8:59 [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression kernel test robot
2024-08-04 20:32 ` Linus Torvalds
2024-08-05 13:56 ` Jeff Xu
@ 2024-08-05 16:58 ` Jeff Xu
2024-08-06 1:44 ` Oliver Sang
2 siblings, 1 reply; 29+ messages in thread
From: Jeff Xu @ 2024-08-05 16:58 UTC (permalink / raw)
To: kernel test robot
Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook,
Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman,
Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet,
Jorge Lucangeli Obes, Linus Torvalds, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
On Sun, Aug 4, 2024 at 1:59 AM kernel test robot <oliver.sang@intel.com> wrote:
>
>
>
> Hello,
>
> kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on:
>
>
> commit: 8be7258aad44b5e25977a98db136f677fa6f4370 ("mseal: add mseal syscall")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> testcase: stress-ng
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
>
> nr_threads: 100%
> testtime: 60s
> test: pagemove
> cpufreq_governor: performance
>
>
> In addition to that, the commit also has significant impact on the following tests:
>
> +------------------+---------------------------------------------------------------------------------------------+
> | testcase: change | stress-ng: stress-ng.pkey.ops_per_sec -3.6% regression |
> | test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory |
> | test parameters | cpufreq_governor=performance |
> | | nr_threads=100% |
> | | test=pkey |
> | | testtime=60s |
> +------------------+---------------------------------------------------------------------------------------------+
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202408041602.caa0372-oliver.sang@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240804/202408041602.caa0372-oliver.sang@intel.com
>
There is an error when I try to reproduce the test:
bin/lkp install job.yaml
--------------------------------------------------------
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
libdw1 : Depends: libelf1 (= 0.190-1+b1)
libdw1t64 : Breaks: libdw1 (< 0.191-2)
E: Unable to correct problems, you have held broken packages.
Cannot install some packages of perf-c2c depends
-----------------------------------------------------------------------------------------
Also, where is the stress-ng.pagemove.page_remaps_per_sec test implemented?
Is it part of lkp-tests?
Thanks
-Jeff
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
>
> commit:
> ff388fe5c4 ("mseal: wire up mseal syscall")
> 8be7258aad ("mseal: add mseal syscall")
>
> ff388fe5c481d39c 8be7258aad44b5e25977a98db13
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 41625945 -4.3% 39842322 proc-vmstat.numa_hit
> 41559175 -4.3% 39774160 proc-vmstat.numa_local
> 77484314 -4.4% 74105555 proc-vmstat.pgalloc_normal
> 77205752 -4.4% 73826672 proc-vmstat.pgfree
> 18361466 -4.2% 17596652 stress-ng.pagemove.ops
> 306014 -4.2% 293262 stress-ng.pagemove.ops_per_sec
> 205312 -4.4% 196176 stress-ng.pagemove.page_remaps_per_sec
> 4961 +1.0% 5013 stress-ng.time.percent_of_cpu_this_job_got
> 2917 +1.2% 2952 stress-ng.time.system_time
> 1.07 -6.6% 1.00 perf-stat.i.MPKI
> 3.354e+10 +3.5% 3.473e+10 perf-stat.i.branch-instructions
> 1.795e+08 -4.2% 1.719e+08 perf-stat.i.cache-misses
> 2.376e+08 -4.1% 2.279e+08 perf-stat.i.cache-references
> 1.13 -3.0% 1.10 perf-stat.i.cpi
> 1077 +4.3% 1124 perf-stat.i.cycles-between-cache-misses
> 1.717e+11 +2.7% 1.762e+11 perf-stat.i.instructions
> 0.88 +3.1% 0.91 perf-stat.i.ipc
> 1.05 -6.8% 0.97 perf-stat.overall.MPKI
> 0.25 ± 2% -0.0 0.24 perf-stat.overall.branch-miss-rate%
> 1.13 -3.0% 1.10 perf-stat.overall.cpi
> 1084 +4.0% 1127 perf-stat.overall.cycles-between-cache-misses
> 0.88 +3.1% 0.91 perf-stat.overall.ipc
> 3.298e+10 +3.5% 3.415e+10 perf-stat.ps.branch-instructions
> 1.764e+08 -4.3% 1.689e+08 perf-stat.ps.cache-misses
> 2.336e+08 -4.1% 2.24e+08 perf-stat.ps.cache-references
> 194.57 -2.4% 189.96 ± 2% perf-stat.ps.cpu-migrations
> 1.688e+11 +2.7% 1.733e+11 perf-stat.ps.instructions
> 1.036e+13 +3.0% 1.068e+13 perf-stat.total.instructions
> 75.12 -1.9 73.22 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 36.84 -1.6 35.29 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> 24.90 -1.2 23.72 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 19.89 -0.9 18.98 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 10.56 ± 2% -0.8 9.78 ± 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread
> 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
> 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 10.57 ± 2% -0.8 9.80 ± 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 10.52 ± 2% -0.8 9.75 ± 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn
> 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
> 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
> 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
> 14.75 -0.7 14.07 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> 1.50 -0.6 0.94 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> 5.88 ± 2% -0.4 5.47 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> 7.80 -0.3 7.47 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 4.55 ± 2% -0.3 4.24 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
> 6.76 -0.3 6.45 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 6.15 -0.3 5.86 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> 8.22 -0.3 7.93 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 6.12 -0.3 5.87 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 5.74 -0.2 5.50 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
> 3.16 ± 2% -0.2 2.94 perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> 5.50 -0.2 5.28 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> 1.36 -0.2 1.14 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
> 5.15 -0.2 4.94 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap
> 5.51 -0.2 5.31 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 5.16 -0.2 4.97 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma
> 2.24 -0.2 2.05 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> 2.60 ± 2% -0.2 2.42 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs
> 4.67 -0.2 4.49 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma
> 3.41 -0.2 3.23 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 3.00 -0.2 2.83 ą 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 0.96 -0.2 0.80 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to
> 4.04 -0.2 3.88 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> 3.20 ± 2% -0.2 3.04 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
> 3.53 -0.1 3.38 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap
> 3.40 -0.1 3.26 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap
> 2.20 ± 2% -0.1 2.06 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> 1.84 ± 3% -0.1 1.71 ± 3% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap
> 1.78 ± 2% -0.1 1.65 ± 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 2.69 -0.1 2.56 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap
> 1.78 ± 2% -0.1 1.66 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
> 1.36 ± 2% -0.1 1.23 ± 2% perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> 0.95 -0.1 0.83 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap
> 3.29 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 2.08 -0.1 1.96 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 1.43 ± 3% -0.1 1.32 ± 3% perf-profile.calltrace.cycles-pp.down_write.vma_prepare.vma_merge.copy_vma.move_vma
> 2.21 -0.1 2.10 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 2.47 -0.1 2.36 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma
> 2.21 -0.1 2.12 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables
> 1.41 -0.1 1.32 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> 1.26 -0.1 1.18 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap
> 1.82 -0.1 1.75 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> 0.71 -0.1 0.63 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 1.29 -0.1 1.22 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
> 0.61 -0.1 0.54 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma
> 1.36 -0.1 1.29 perf-profile.calltrace.cycles-pp.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap
> 1.40 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma
> 0.70 -0.1 0.64 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
> 1.23 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma
> 1.66 -0.1 1.60 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.16 -0.1 1.10 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> 0.96 -0.1 0.90 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region
> 1.14 -0.1 1.08 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
> 0.79 -0.1 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma
> 1.04 -0.1 1.00 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.58 -0.0 0.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap
> 0.61 -0.0 0.56 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
> 0.56 -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
> 0.57 -0.0 0.53 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
> 0.78 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge
> 0.88 -0.0 0.84 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64
> 0.70 -0.0 0.66 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap
> 0.97 -0.0 0.93 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 1.11 -0.0 1.08 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap
> 0.75 -0.0 0.72 perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
> 0.74 -0.0 0.71 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap
> 0.60 ± 2% -0.0 0.57 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64
> 0.67 ± 2% -0.0 0.64 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma
> 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 0.63 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> 0.99 -0.0 0.96 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 0.62 ± 2% -0.0 0.59 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
> 0.87 -0.0 0.84 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 0.78 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap
> 0.64 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap
> 0.90 -0.0 0.87 perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap
> 0.54 -0.0 0.52 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.move_ptes.move_page_tables.move_vma.__do_sys_mremap
> 1.04 +0.0 1.08 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
> 0.76 +0.1 0.83 perf-profile.calltrace.cycles-pp.__madvise
> 0.63 +0.1 0.70 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> 0.62 +0.1 0.70 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
> 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> 87.74 +0.7 88.45 perf-profile.calltrace.cycles-pp.mremap
> 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap
> 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap
> 84.88 +0.9 85.77 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap
> 84.73 +0.9 85.62 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 0.00 +0.9 0.92 ± 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma
> 83.84 +0.9 84.78 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 0.00 +1.1 1.06 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64
> 0.00 +1.2 1.21 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to
> 2.07 +1.5 3.55 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1.58 +1.5 3.07 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64
> 0.00 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap
> 0.00 +1.6 1.57 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 0.00 +1.7 1.72 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
> 0.00 +2.0 2.01 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> 5.39 +2.9 8.32 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> 75.29 -1.9 73.37 perf-profile.children.cycles-pp.move_vma
> 37.06 -1.6 35.50 perf-profile.children.cycles-pp.do_vmi_align_munmap
> 24.98 -1.2 23.80 perf-profile.children.cycles-pp.copy_vma
> 19.99 -1.0 19.02 perf-profile.children.cycles-pp.handle_softirqs
> 19.97 -1.0 19.00 perf-profile.children.cycles-pp.rcu_core
> 19.95 -1.0 18.98 perf-profile.children.cycles-pp.rcu_do_batch
> 19.98 -0.9 19.06 perf-profile.children.cycles-pp.__split_vma
> 17.55 -0.8 16.76 perf-profile.children.cycles-pp.kmem_cache_free
> 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.children.cycles-pp.run_ksoftirqd
> 10.57 ± 2% -0.8 9.80 ± 2% perf-profile.children.cycles-pp.smpboot_thread_fn
> 15.38 -0.8 14.62 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
> 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.children.cycles-pp.kthread
> 10.62 ± 2% -0.8 9.86 ± 2% perf-profile.children.cycles-pp.ret_from_fork
> 10.62 ± 2% -0.8 9.86 ± 2% perf-profile.children.cycles-pp.ret_from_fork_asm
> 15.14 -0.7 14.44 perf-profile.children.cycles-pp.vma_merge
> 12.08 -0.5 11.55 perf-profile.children.cycles-pp.__slab_free
> 12.11 -0.5 11.62 perf-profile.children.cycles-pp.mas_wr_store_entry
> 10.86 -0.5 10.39 perf-profile.children.cycles-pp.vm_area_dup
> 11.89 -0.5 11.44 perf-profile.children.cycles-pp.mas_store_prealloc
> 8.49 -0.4 8.06 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
> 9.88 -0.4 9.49 perf-profile.children.cycles-pp.mas_wr_node_store
> 7.91 -0.3 7.58 perf-profile.children.cycles-pp.move_page_tables
> 6.06 -0.3 5.78 perf-profile.children.cycles-pp.vm_area_free_rcu_cb
> 8.28 -0.3 8.00 perf-profile.children.cycles-pp.unmap_region
> 6.69 -0.3 6.42 perf-profile.children.cycles-pp.vma_complete
> 5.06 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate
> 5.82 -0.2 5.57 perf-profile.children.cycles-pp.move_ptes
> 4.24 -0.2 4.01 perf-profile.children.cycles-pp.anon_vma_clone
> 3.50 -0.2 3.30 perf-profile.children.cycles-pp.down_write
> 2.44 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev
> 3.46 -0.2 3.28 perf-profile.children.cycles-pp.___slab_alloc
> 3.45 -0.2 3.27 perf-profile.children.cycles-pp.free_pgtables
> 2.54 -0.2 2.37 perf-profile.children.cycles-pp.rcu_cblist_dequeue
> 3.35 -0.2 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook
> 2.93 -0.2 2.78 perf-profile.children.cycles-pp.mas_alloc_nodes
> 2.28 ± 2% -0.2 2.12 ± 2% perf-profile.children.cycles-pp.vma_prepare
> 3.46 -0.1 3.32 perf-profile.children.cycles-pp.flush_tlb_mm_range
> 3.41 -0.1 3.27 ± 2% perf-profile.children.cycles-pp.mod_objcg_state
> 2.76 -0.1 2.63 perf-profile.children.cycles-pp.unlink_anon_vmas
> 3.41 -0.1 3.28 perf-profile.children.cycles-pp.mas_store_gfp
> 2.21 -0.1 2.09 perf-profile.children.cycles-pp.__cond_resched
> 2.04 -0.1 1.94 perf-profile.children.cycles-pp.allocate_slab
> 2.10 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common
> 2.51 -0.1 2.40 perf-profile.children.cycles-pp.flush_tlb_func
> 1.04 -0.1 0.94 perf-profile.children.cycles-pp.mas_prev
> 2.71 -0.1 2.61 perf-profile.children.cycles-pp.mtree_load
> 2.23 -0.1 2.14 perf-profile.children.cycles-pp.native_flush_tlb_one_user
> 0.22 ± 5% -0.1 0.13 ± 13% perf-profile.children.cycles-pp.vm_stat_account
> 0.95 -0.1 0.87 perf-profile.children.cycles-pp.mas_prev_setup
> 1.65 -0.1 1.57 perf-profile.children.cycles-pp.mas_wr_walk
> 1.84 -0.1 1.76 perf-profile.children.cycles-pp.up_write
> 1.27 -0.1 1.20 perf-profile.children.cycles-pp.mas_prev_slot
> 1.84 -0.1 1.77 perf-profile.children.cycles-pp.vma_link
> 1.39 -0.1 1.32 perf-profile.children.cycles-pp.shuffle_freelist
> 0.96 -0.1 0.90 ± 2% perf-profile.children.cycles-pp.rcu_all_qs
> 0.86 -0.1 0.80 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> 1.70 -0.1 1.64 perf-profile.children.cycles-pp.__get_unmapped_area
> 0.34 ± 3% -0.1 0.29 ± 5% perf-profile.children.cycles-pp.security_vm_enough_memory_mm
> 0.60 -0.0 0.55 perf-profile.children.cycles-pp.entry_SYSCALL_64
> 0.92 -0.0 0.87 perf-profile.children.cycles-pp.percpu_counter_add_batch
> 1.07 -0.0 1.02 perf-profile.children.cycles-pp.vma_to_resize
> 1.59 -0.0 1.54 perf-profile.children.cycles-pp.mas_update_gap
> 0.44 ± 2% -0.0 0.40 ± 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> 0.70 -0.0 0.66 perf-profile.children.cycles-pp.syscall_return_via_sysret
> 1.13 -0.0 1.09 perf-profile.children.cycles-pp.mt_find
> 0.20 ± 6% -0.0 0.17 ± 9% perf-profile.children.cycles-pp.cap_vm_enough_memory
> 0.99 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node
> 0.63 ± 2% -0.0 0.59 perf-profile.children.cycles-pp.security_mmap_addr
> 0.62 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials
> 1.17 -0.0 1.14 perf-profile.children.cycles-pp.clear_bhb_loop
> 0.46 -0.0 0.43 ± 2% perf-profile.children.cycles-pp.__alloc_pages_noprof
> 0.44 -0.0 0.41 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
> 0.90 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete
> 0.64 ± 2% -0.0 0.62 perf-profile.children.cycles-pp.get_old_pud
> 1.07 -0.0 1.05 perf-profile.children.cycles-pp.mas_leaf_max_gap
> 0.22 ± 3% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.__rmqueue_pcplist
> 0.55 -0.0 0.53 perf-profile.children.cycles-pp.refill_obj_stock
> 0.25 -0.0 0.23 ± 3% perf-profile.children.cycles-pp.rmqueue
> 0.48 -0.0 0.45 perf-profile.children.cycles-pp.mremap_userfaultfd_prep
> 0.33 -0.0 0.30 perf-profile.children.cycles-pp.free_unref_page
> 0.46 -0.0 0.44 perf-profile.children.cycles-pp.setup_object
> 0.21 ± 3% -0.0 0.19 ± 2% perf-profile.children.cycles-pp.rmqueue_bulk
> 0.31 ą 3% -0.0 0.29 perf-profile.children.cycles-pp.__vm_enough_memory
> 0.40 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> 0.36 -0.0 0.35 perf-profile.children.cycles-pp.madvise_vma_behavior
> 0.54 -0.0 0.53 ± 2% perf-profile.children.cycles-pp.mas_wr_end_piv
> 0.46 -0.0 0.44 ± 2% perf-profile.children.cycles-pp.rcu_segcblist_enqueue
> 0.34 -0.0 0.32 ± 2% perf-profile.children.cycles-pp.mas_destroy
> 0.28 -0.0 0.26 ± 3% perf-profile.children.cycles-pp.mas_wr_store_setup
> 0.30 -0.0 0.28 perf-profile.children.cycles-pp.pte_offset_map_nolock
> 0.19 -0.0 0.18 ± 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders
> 0.08 ± 4% -0.0 0.07 perf-profile.children.cycles-pp.ksm_madvise
> 0.17 -0.0 0.16 perf-profile.children.cycles-pp.get_any_partial
> 0.08 -0.0 0.07 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
> 0.45 +0.0 0.47 perf-profile.children.cycles-pp._raw_spin_lock
> 1.10 +0.0 1.14 perf-profile.children.cycles-pp.zap_pte_range
> 0.78 +0.1 0.85 perf-profile.children.cycles-pp.__madvise
> 0.63 +0.1 0.70 perf-profile.children.cycles-pp.__x64_sys_madvise
> 0.62 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise
> 0.00 +0.1 0.09 ± 4% perf-profile.children.cycles-pp.can_modify_mm_madv
> 1.32 +0.1 1.46 perf-profile.children.cycles-pp.mas_next_slot
> 88.13 +0.7 88.83 perf-profile.children.cycles-pp.mremap
> 83.94 +0.9 84.88 perf-profile.children.cycles-pp.__do_sys_mremap
> 86.06 +0.9 87.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 85.56 +1.0 86.54 perf-profile.children.cycles-pp.do_syscall_64
> 40.49 +1.4 41.90 perf-profile.children.cycles-pp.do_vmi_munmap
> 2.10 +1.5 3.57 perf-profile.children.cycles-pp.do_munmap
> 3.62 +2.3 5.90 perf-profile.children.cycles-pp.mas_walk
> 5.44 +2.9 8.38 perf-profile.children.cycles-pp.mremap_to
> 5.30 +3.1 8.39 perf-profile.children.cycles-pp.mas_find
> 0.00 +5.4 5.40 perf-profile.children.cycles-pp.can_modify_mm
> 11.46 -0.5 10.96 perf-profile.self.cycles-pp.__slab_free
> 4.30 -0.2 4.08 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
> 2.51 -0.2 2.34 perf-profile.self.cycles-pp.rcu_cblist_dequeue
> 2.41 ± 2% -0.2 2.25 perf-profile.self.cycles-pp.down_write
> 2.21 -0.1 2.11 perf-profile.self.cycles-pp.native_flush_tlb_one_user
> 2.37 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load
> 1.60 -0.1 1.51 perf-profile.self.cycles-pp.__memcg_slab_free_hook
> 0.18 ± 3% -0.1 0.10 ± 15% perf-profile.self.cycles-pp.vm_stat_account
> 1.25 -0.1 1.18 perf-profile.self.cycles-pp.move_vma
> 1.76 -0.1 1.69 perf-profile.self.cycles-pp.mod_objcg_state
> 1.42 -0.1 1.35 ± 2% perf-profile.self.cycles-pp.__call_rcu_common
> 1.41 -0.1 1.34 perf-profile.self.cycles-pp.mas_wr_walk
> 1.52 -0.1 1.46 perf-profile.self.cycles-pp.up_write
> 1.02 -0.1 0.95 perf-profile.self.cycles-pp.mas_prev_slot
> 0.96 -0.1 0.90 ± 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb
> 1.50 -0.1 1.45 perf-profile.self.cycles-pp.kmem_cache_free
> 0.69 ± 3% -0.1 0.64 ± 2% perf-profile.self.cycles-pp.rcu_all_qs
> 1.14 ± 2% -0.1 1.09 perf-profile.self.cycles-pp.shuffle_freelist
> 1.10 -0.1 1.05 perf-profile.self.cycles-pp.__cond_resched
> 1.40 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap
> 0.99 -0.0 0.94 perf-profile.self.cycles-pp.mas_preallocate
> 0.88 -0.0 0.83 perf-profile.self.cycles-pp.___slab_alloc
> 0.55 -0.0 0.50 perf-profile.self.cycles-pp.mremap_to
> 0.98 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes
> 0.78 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch
> 0.21 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64
> 0.44 ± 2% -0.0 0.40 ± 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> 0.92 -0.0 0.89 perf-profile.self.cycles-pp.mas_store_gfp
> 0.86 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node
> 0.50 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> 1.15 -0.0 1.12 perf-profile.self.cycles-pp.clear_bhb_loop
> 1.14 -0.0 1.11 perf-profile.self.cycles-pp.vma_merge
> 0.66 -0.0 0.63 perf-profile.self.cycles-pp.__split_vma
> 0.16 ± 6% -0.0 0.13 ± 7% perf-profile.self.cycles-pp.cap_vm_enough_memory
> 0.82 -0.0 0.79 perf-profile.self.cycles-pp.mas_wr_store_entry
> 0.54 ± 2% -0.0 0.52 perf-profile.self.cycles-pp.get_old_pud
> 0.43 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap
> 0.51 ± 2% -0.0 0.48 ± 2% perf-profile.self.cycles-pp.security_mmap_addr
> 0.50 -0.0 0.48 perf-profile.self.cycles-pp.refill_obj_stock
> 0.24 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev
> 0.71 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range
> 0.48 -0.0 0.45 perf-profile.self.cycles-pp.find_vma_prev
> 0.42 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> 0.66 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc
> 0.31 -0.0 0.29 perf-profile.self.cycles-pp.mas_prev_setup
> 0.43 -0.0 0.41 perf-profile.self.cycles-pp.mas_wr_end_piv
> 0.78 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete
> 0.28 -0.0 0.26 ± 2% perf-profile.self.cycles-pp.mas_put_in_tree
> 0.42 -0.0 0.40 perf-profile.self.cycles-pp.mremap_userfaultfd_prep
> 0.28 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables
> 0.39 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> 0.30 ± 2% -0.0 0.28 perf-profile.self.cycles-pp.zap_pmd_range
> 0.32 -0.0 0.31 perf-profile.self.cycles-pp.unmap_vmas
> 0.21 -0.0 0.20 perf-profile.self.cycles-pp.__get_unmapped_area
> 0.18 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.lru_add_drain_cpu
> 0.06 -0.0 0.05 perf-profile.self.cycles-pp.ksm_madvise
> 0.45 +0.0 0.46 perf-profile.self.cycles-pp.do_vmi_munmap
> 0.37 +0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock
> 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot
> 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find
> 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm
> 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk
>
>
> ***************************************************************************************************
> lkp-spr-r02: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pkey/stress-ng/60s
>
> commit:
> ff388fe5c4 ("mseal: wire up mseal syscall")
> 8be7258aad ("mseal: add mseal syscall")
>
> ff388fe5c481d39c 8be7258aad44b5e25977a98db13
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 10539 -2.5% 10273 vmstat.system.cs
> 0.28 ± 5% -20.1% 0.22 ± 7% sched_debug.cfs_rq:/.h_nr_running.stddev
> 1419 ± 7% -15.3% 1202 ± 6% sched_debug.cfs_rq:/.util_avg.max
> 0.28 ± 6% -18.4% 0.23 ± 8% sched_debug.cpu.nr_running.stddev
> 8.736e+08 -3.6% 8.423e+08 stress-ng.pkey.ops
> 14560560 -3.6% 14038795 stress-ng.pkey.ops_per_sec
> 770.39 ± 4% -5.0% 732.04 stress-ng.time.user_time
> 244657 ± 3% +5.8% 258782 ± 3% proc-vmstat.nr_slab_unreclaimable
> 73133541 -2.1% 71588873 proc-vmstat.numa_hit
> 72873579 -2.1% 71357274 proc-vmstat.numa_local
> 1.842e+08 -2.5% 1.796e+08 proc-vmstat.pgalloc_normal
> 1.767e+08 -2.8% 1.717e+08 proc-vmstat.pgfree
> 1345346 ± 40% -73.1% 362064 ±124% numa-vmstat.node0.nr_inactive_anon
> 1345340 ± 40% -73.1% 362062 ±124% numa-vmstat.node0.nr_zone_inactive_anon
> 2420830 ± 14% +35.1% 3270248 ± 16% numa-vmstat.node1.nr_file_pages
> 2067871 ± 13% +51.5% 3132982 ± 17% numa-vmstat.node1.nr_inactive_anon
> 191406 ± 17% +33.6% 255808 ± 14% numa-vmstat.node1.nr_mapped
> 2452 ± 61% +104.4% 5012 ± 35% numa-vmstat.node1.nr_page_table_pages
> 2067853 ± 13% +51.5% 3132966 ± 17% numa-vmstat.node1.nr_zone_inactive_anon
> 5379238 ± 40% -73.0% 1453605 ±123% numa-meminfo.node0.Inactive
> 5379166 ± 40% -73.0% 1453462 ±123% numa-meminfo.node0.Inactive(anon)
> 8741077 ± 22% -36.7% 5531290 ± 28% numa-meminfo.node0.MemUsed
> 9651902 ± 13% +35.8% 13105318 ± 16% numa-meminfo.node1.FilePages
> 8239855 ± 13% +52.4% 12556929 ± 17% numa-meminfo.node1.Inactive
> 8239712 ± 13% +52.4% 12556853 ± 17% numa-meminfo.node1.Inactive(anon)
> 761944 ± 18% +34.6% 1025906 ± 14% numa-meminfo.node1.Mapped
> 11679628 ± 11% +31.2% 15322841 ± 14% numa-meminfo.node1.MemUsed
> 9874 ± 62% +104.6% 20200 ± 36% numa-meminfo.node1.PageTables
> 0.74 -4.2% 0.71 perf-stat.i.MPKI
> 1.245e+11 +2.3% 1.274e+11 perf-stat.i.branch-instructions
> 0.37 -0.0 0.35 perf-stat.i.branch-miss-rate%
> 4.359e+08 -2.1% 4.265e+08 perf-stat.i.branch-misses
> 4.672e+08 -2.6% 4.548e+08 perf-stat.i.cache-misses
> 7.276e+08 -2.7% 7.082e+08 perf-stat.i.cache-references
> 1.00 -1.6% 0.98 perf-stat.i.cpi
> 1364 +2.9% 1404 perf-stat.i.cycles-between-cache-misses
> 6.392e+11 +1.7% 6.499e+11 perf-stat.i.instructions
> 1.00 +1.6% 1.02 perf-stat.i.ipc
> 0.74 -4.3% 0.71 perf-stat.overall.MPKI
> 0.35 -0.0 0.33 perf-stat.overall.branch-miss-rate%
> 1.00 -1.6% 0.99 perf-stat.overall.cpi
> 1356 +2.9% 1395 perf-stat.overall.cycles-between-cache-misses
> 1.00 +1.6% 1.01 perf-stat.overall.ipc
> 1.209e+11 +1.9% 1.232e+11 perf-stat.ps.branch-instructions
> 4.188e+08 -2.6% 4.077e+08 perf-stat.ps.branch-misses
> 4.585e+08 -3.1% 4.441e+08 perf-stat.ps.cache-misses
> 7.124e+08 -3.1% 6.901e+08 perf-stat.ps.cache-references
> 10321 -2.6% 10053 perf-stat.ps.context-switches
>
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-04 20:32 ` Linus Torvalds
2024-08-05 13:33 ` Pedro Falcato
@ 2024-08-05 17:54 ` Jeff Xu
1 sibling, 0 replies; 29+ messages in thread
From: Jeff Xu @ 2024-08-05 17:54 UTC (permalink / raw)
To: Linus Torvalds
Cc: kernel test robot, oe-lkp, lkp, linux-kernel, Andrew Morton,
Kees Cook, Liam R. Howlett, Pedro Falcato, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jeff Xu,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
On Sun, Aug 4, 2024 at 1:33 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, 4 Aug 2024 at 01:59, kernel test robot <oliver.sang@intel.com> wrote:
> >
> > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on
> > commit 8be7258aad44 ("mseal: add mseal syscall")
>
> Ok, it's basically just the vma walk in can_modify_mm():
>
> > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot
> > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find
> > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm
> > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk
>
> and looks like it's two different pathways. We have __do_sys_mremap ->
> mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the
> destination mapping, but we also have mremap_to() calling
> can_modify_mm() directly for the source mapping.
>
There are two scenarios in the mremap syscall:
1> mremap_to (relocate vma)
2> shrink/expand
These two scenarios are handled by different code paths.
For case 1>
mremap_to (relocate vma)
-> can_modify_mm , check src for sealing.
-> if MREMAP_FIXED
->-> do_munmap (dst) // free dst
->->-> do_vmi_munmap (dst)
->->->-> can_modify_mm (dst) // check dst for sealing
-> if dst size is smaller (shrink case)
->-> do_munmap(dst, to remove extra size)
->->-> do_vmi_munmap
->->->-> can_modify_mm(dst) (potentially duplicates the MREMAP_FIXED
check; in practice that memory should already be unmapped, so the cost
is a lookup of a nonexistent memory range in the maple tree)
For case 2>
shrink/expand
-> can_modify_mm, check whether addr is sealed
-> if dst size is smaller (shrink case)
->-> do_vmi_munmap(remove_extra_size)
->->-> can_modify_mm(addr) (this is redundant because addr was already checked)
For case 2, we could potentially improve this by passing a flag into
do_vmi_munmap() to indicate that sealing has already been checked by
the caller. (However, this idea has to be tested to show an actual gain.)
The reported regression is in mremap; I wonder why mprotect/munmap
don't show a similar impact, since they use the same pattern (one
extra out-of-place check of the memory range).
While working on version 9, I tested munmap/mprotect/madvise for perf [1].
The tests show mseal adds 20-40 ns, or 50-100 CPU cycles, per call;
that is much smaller (about one tenth) than the change from 5.10 to
6.8. The tests use multiple VMAs of various types [2]. The next step
for me is to run stress-ng.pagemove.page_remaps_per_sec to understand
why mremap shows such a big regression number.
[1] https://lore.kernel.org/all/20240214151130.616240-1-jeffxu@chromium.org/
[2] https://github.com/peaktocreek/mmperf
Best regards,
-Jeff
> And then do_vmi_munmap() will do its *own* vma_find() after having
> done arch_unmap().
>
> And do_munmap() will obviously do its own vma lookup as part of
> calling vma_to_resize().
>
> So it looks like a large portion of this regression is because the
> mseal addition just ends up walking the vma list way too much.
>
> Linus
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 13:33 ` Pedro Falcato
@ 2024-08-05 18:10 ` Jeff Xu
2024-08-05 18:55 ` Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Jeff Xu @ 2024-08-05 18:10 UTC (permalink / raw)
To: Pedro Falcato
Cc: Linus Torvalds, kernel test robot, Jeff Xu, oe-lkp, lkp,
linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
On Mon, Aug 5, 2024 at 6:33 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Sun, Aug 4, 2024 at 9:33 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Sun, 4 Aug 2024 at 01:59, kernel test robot <oliver.sang@intel.com> wrote:
> > >
> > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on
> > > commit 8be7258aad44 ("mseal: add mseal syscall")
> >
> > Ok, it's basically just the vma walk in can_modify_mm():
> >
> > > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot
> > > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find
> > > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm
> > > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk
> >
> > and looks like it's two different pathways. We have __do_sys_mremap ->
> > mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the
> > destination mapping, but we also have mremap_to() calling
> > can_modify_mm() directly for the source mapping.
> >
> > And then do_vmi_munmap() will do its *own* vma_find() after having
> > done arch_unmap().
> >
> > And do_munmap() will obviously do its own vma lookup as part of
> > calling vma_to_resize().
> >
> > So it looks like a large portion of this regression is because the
> > mseal addition just ends up walking the vma list way too much.
>
> Can we rollback the upfront checks "funny business" and just call
> can_modify_vma directly in relevant places? I still don't believe in
> the partial mprotect/munmap "security risks" that were stated in the
> mseal thread (and these operations can already fail for many other
> reasons than mseal) :)
>
An in-place check and an extra loop, implemented properly, will both
prevent changes to sealed memory.
However, the extra loop makes it difficult for an attacker to call
munmap(0, random-large-size), because if any of the VMAs in the range
is sealed, the whole operation will be a no-op.
> I don't mind taking a look myself, just want to make sure I'm not
> stepping on anyone's toes here.
>
One thing that you can't work around is that can_modify_mm must be
called prior to arch_unmap, which means an in-place check for the
munmap is not possible.
(There are recent patches/refactors by Liam R. Howlett in this area,
but I am not sure whether this restriction has been removed.)
> --
> Pedro
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 18:10 ` Jeff Xu
@ 2024-08-05 18:55 ` Linus Torvalds
2024-08-05 19:33 ` Linus Torvalds
2024-08-05 19:37 ` Jeff Xu
0 siblings, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2024-08-05 18:55 UTC (permalink / raw)
To: Jeff Xu, Michael Ellerman, Nicholas Piggin, Christophe Leroy
Cc: Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp,
linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
[-- Attachment #1: Type: text/plain, Size: 1685 bytes --]
On Mon, 5 Aug 2024 at 11:11, Jeff Xu <jeffxu@google.com> wrote:
>
> One thing that you can't walk around is that can_modify_mm must be
> called prior to arch_unmap, that means in-place check for the munmap
> is not possible.
Actually, we should move 'arch_unmap()'.
There is only one user of it, and it's pretty pointless.
(Ok, there are two users - x86 also has an 'arch_unmap()', but it's empty).
The reason I say that the current user of arch_unmap() is pointless is
because this is what the powerpc user does:
static inline void arch_unmap(struct mm_struct *mm,
unsigned long start, unsigned long end)
{
unsigned long vdso_base = (unsigned long)mm->context.vdso;
if (start <= vdso_base && vdso_base < end)
mm->context.vdso = NULL;
}
and that would make sense if we didn't have an actual 'vma' that
matched the vdso. But we do.
I think this code may predate the whole "create a vma for the vdso"
code. Or maybe it was just always confused.
Anyway, what the code *should* do is that we should just have a
->close() function for special mappings, and call that in
special_mapping_close().
This is an ENTIRELY UNTESTED patch that gets rid of this horrendous wart.
Michael / Nick / Christophe? Note that I didn't even compile-test this
on x86-64, much less on powerpc.
So please consider this a "maybe something like this" patch, but that
'arch_unmap()' really is pretty nasty.
Oh, and there was a bug in the error path of the powerpc vdso setup
code anyway. The patch fixes that too, although considering the
entirely untested nature of it, the "fixes" is laughably optimistic.
Linus
[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 6309 bytes --]
arch/powerpc/include/asm/mmu_context.h | 9 ---------
arch/powerpc/kernel/vdso.c | 12 +++++++++++-
arch/x86/include/asm/mmu_context.h | 5 -----
include/asm-generic/mm_hooks.h | 11 +++--------
include/linux/mm_types.h | 2 ++
mm/mmap.c | 15 ++++++---------
6 files changed, 22 insertions(+), 32 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 37bffa0f7918..a334a1368848 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -260,15 +260,6 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
extern void arch_exit_mmap(struct mm_struct *mm);
-static inline void arch_unmap(struct mm_struct *mm,
- unsigned long start, unsigned long end)
-{
- unsigned long vdso_base = (unsigned long)mm->context.vdso;
-
- if (start <= vdso_base && vdso_base < end)
- mm->context.vdso = NULL;
-}
-
#ifdef CONFIG_PPC_MEM_KEYS
bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
bool execute, bool foreign);
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 7a2ff9010f17..4de8af43f920 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -81,12 +81,20 @@ static int vdso64_mremap(const struct vm_special_mapping *sm, struct vm_area_str
return vdso_mremap(sm, new_vma, &vdso64_end - &vdso64_start);
}
+static void vvar_close(const struct vm_special_mapping *sm,
+ struct vm_area_struct *vma)
+{
+ struct mm_struct *mm = vma->vm_mm;
+ mm->context.vdso = NULL;
+}
+
static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
struct vm_area_struct *vma, struct vm_fault *vmf);
static struct vm_special_mapping vvar_spec __ro_after_init = {
.name = "[vvar]",
.fault = vvar_fault,
+ .close = vvar_close,
};
static struct vm_special_mapping vdso32_spec __ro_after_init = {
@@ -207,8 +215,10 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int
vma = _install_special_mapping(mm, vdso_base, vvar_size,
VM_READ | VM_MAYREAD | VM_IO |
VM_DONTDUMP | VM_PFNMAP, &vvar_spec);
- if (IS_ERR(vma))
+ if (IS_ERR(vma)) {
+ mm->context.vdso = NULL;
return PTR_ERR(vma);
+ }
/*
* our vma flags don't have VM_WRITE so by default, the process isn't
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 8dac45a2c7fc..80f2a3187aa6 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -232,11 +232,6 @@ static inline bool is_64bit_mm(struct mm_struct *mm)
}
#endif
-static inline void arch_unmap(struct mm_struct *mm, unsigned long start,
- unsigned long end)
-{
-}
-
/*
* We only want to enforce protection keys on the current process
* because we effectively have no access to PKRU for other
diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h
index 4dbb177d1150..6eea3b3c1e65 100644
--- a/include/asm-generic/mm_hooks.h
+++ b/include/asm-generic/mm_hooks.h
@@ -1,8 +1,8 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
- * Define generic no-op hooks for arch_dup_mmap, arch_exit_mmap
- * and arch_unmap to be included in asm-FOO/mmu_context.h for any
- * arch FOO which doesn't need to hook these.
+ * Define generic no-op hooks for arch_dup_mmap and arch_exit_mmap
+ * to be included in asm-FOO/mmu_context.h for any arch FOO which
+ * doesn't need to hook these.
*/
#ifndef _ASM_GENERIC_MM_HOOKS_H
#define _ASM_GENERIC_MM_HOOKS_H
@@ -17,11 +17,6 @@ static inline void arch_exit_mmap(struct mm_struct *mm)
{
}
-static inline void arch_unmap(struct mm_struct *mm,
- unsigned long start, unsigned long end)
-{
-}
-
static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
bool write, bool execute, bool foreign)
{
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 485424979254..ef32d87a3adc 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1313,6 +1313,8 @@ struct vm_special_mapping {
int (*mremap)(const struct vm_special_mapping *sm,
struct vm_area_struct *new_vma);
+ void (*close)(const struct vm_special_mapping *sm,
+ struct vm_area_struct *vma);
};
enum tlb_flush_reason {
diff --git a/mm/mmap.c b/mm/mmap.c
index d0dfc85b209b..adaaf1ef197a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2789,7 +2789,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
*
* This function takes a @mas that is either pointing to the previous VMA or set
* to MA_START and sets it up to remove the mapping(s). The @len will be
- * aligned and any arch_unmap work will be preformed.
+ * aligned.
*
* Return: 0 on success and drops the lock if so directed, error and leaves the
* lock held otherwise.
@@ -2809,16 +2809,12 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
return -EINVAL;
/*
- * Check if memory is sealed before arch_unmap.
- * Prevent unmapping a sealed VMA.
+ * Check if memory is sealed, prevent unmapping a sealed VMA.
* can_modify_mm assumes we have acquired the lock on MM.
*/
if (unlikely(!can_modify_mm(mm, start, end)))
return -EPERM;
- /* arch_unmap() might do unmaps itself. */
- arch_unmap(mm, start, end);
-
/* Find the first overlapping VMA */
vma = vma_find(vmi, end);
if (!vma) {
@@ -3232,14 +3228,12 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
struct mm_struct *mm = vma->vm_mm;
/*
- * Check if memory is sealed before arch_unmap.
- * Prevent unmapping a sealed VMA.
+ * Check if memory is sealed, prevent unmapping a sealed VMA.
* can_modify_mm assumes we have acquired the lock on MM.
*/
if (unlikely(!can_modify_mm(mm, start, end)))
return -EPERM;
- arch_unmap(mm, start, end);
return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
}
@@ -3624,6 +3618,9 @@ static vm_fault_t special_mapping_fault(struct vm_fault *vmf);
*/
static void special_mapping_close(struct vm_area_struct *vma)
{
+ const struct vm_special_mapping *sm = vma->vm_private_data;
+ if (sm->close)
+ sm->close(sm, vma);
}
static const char *special_mapping_name(struct vm_area_struct *vma)
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 18:55 ` Linus Torvalds
@ 2024-08-05 19:33 ` Linus Torvalds
2024-08-06 2:14 ` Michael Ellerman
2024-08-06 6:04 ` Oliver Sang
2024-08-05 19:37 ` Jeff Xu
1 sibling, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2024-08-05 19:33 UTC (permalink / raw)
To: Jeff Xu, Michael Ellerman, Nicholas Piggin, Christophe Leroy
Cc: Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp,
linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
[-- Attachment #1: Type: text/plain, Size: 601 bytes --]
On Mon, 5 Aug 2024 at 11:55, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So please consider this a "maybe something like this" patch, but that
> 'arch_unmap()' really is pretty nasty
Actually, the whole powerpc vdso code confused me. It's not the vvar
thing that wants this close thing, it's the other ones that have the
remap thing.
.. and there were two of those error cases that needed to reset the
vdso pointer.
That all shows just how carefully I was reading this code.
New version - still untested, but now I've read through it one more
time - attached.
Linus
[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 6923 bytes --]
arch/powerpc/include/asm/mmu_context.h | 9 ---------
arch/powerpc/kernel/vdso.c | 17 +++++++++++++++--
arch/x86/include/asm/mmu_context.h | 5 -----
include/asm-generic/mm_hooks.h | 11 +++--------
include/linux/mm_types.h | 2 ++
mm/mmap.c | 15 ++++++---------
6 files changed, 26 insertions(+), 33 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 37bffa0f7918..a334a1368848 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -260,15 +260,6 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
extern void arch_exit_mmap(struct mm_struct *mm);
-static inline void arch_unmap(struct mm_struct *mm,
- unsigned long start, unsigned long end)
-{
- unsigned long vdso_base = (unsigned long)mm->context.vdso;
-
- if (start <= vdso_base && vdso_base < end)
- mm->context.vdso = NULL;
-}
-
#ifdef CONFIG_PPC_MEM_KEYS
bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
bool execute, bool foreign);
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 7a2ff9010f17..6fa041a6690a 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -81,6 +81,13 @@ static int vdso64_mremap(const struct vm_special_mapping *sm, struct vm_area_str
return vdso_mremap(sm, new_vma, &vdso64_end - &vdso64_start);
}
+static void vvar_close(const struct vm_special_mapping *sm,
+ struct vm_area_struct *vma)
+{
+ struct mm_struct *mm = vma->vm_mm;
+ mm->context.vdso = NULL;
+}
+
static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
struct vm_area_struct *vma, struct vm_fault *vmf);
@@ -92,11 +99,13 @@ static struct vm_special_mapping vvar_spec __ro_after_init = {
static struct vm_special_mapping vdso32_spec __ro_after_init = {
.name = "[vdso]",
.mremap = vdso32_mremap,
+ .close = vvar_close,
};
static struct vm_special_mapping vdso64_spec __ro_after_init = {
.name = "[vdso]",
.mremap = vdso64_mremap,
+ .close = vvar_close,
};
#ifdef CONFIG_TIME_NS
@@ -207,8 +216,10 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int
vma = _install_special_mapping(mm, vdso_base, vvar_size,
VM_READ | VM_MAYREAD | VM_IO |
VM_DONTDUMP | VM_PFNMAP, &vvar_spec);
- if (IS_ERR(vma))
+ if (IS_ERR(vma)) {
+ mm->context.vdso = NULL;
return PTR_ERR(vma);
+ }
/*
* our vma flags don't have VM_WRITE so by default, the process isn't
@@ -223,8 +234,10 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int
vma = _install_special_mapping(mm, vdso_base + vvar_size, vdso_size,
VM_READ | VM_EXEC | VM_MAYREAD |
VM_MAYWRITE | VM_MAYEXEC, vdso_spec);
- if (IS_ERR(vma))
+ if (IS_ERR(vma)) {
+ mm->context.vdso = NULL;
do_munmap(mm, vdso_base, vvar_size, NULL);
+ }
return PTR_ERR_OR_ZERO(vma);
}
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 8dac45a2c7fc..80f2a3187aa6 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -232,11 +232,6 @@ static inline bool is_64bit_mm(struct mm_struct *mm)
}
#endif
-static inline void arch_unmap(struct mm_struct *mm, unsigned long start,
- unsigned long end)
-{
-}
-
/*
* We only want to enforce protection keys on the current process
* because we effectively have no access to PKRU for other
diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h
index 4dbb177d1150..6eea3b3c1e65 100644
--- a/include/asm-generic/mm_hooks.h
+++ b/include/asm-generic/mm_hooks.h
@@ -1,8 +1,8 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
- * Define generic no-op hooks for arch_dup_mmap, arch_exit_mmap
- * and arch_unmap to be included in asm-FOO/mmu_context.h for any
- * arch FOO which doesn't need to hook these.
+ * Define generic no-op hooks for arch_dup_mmap and arch_exit_mmap
+ * to be included in asm-FOO/mmu_context.h for any arch FOO which
+ * doesn't need to hook these.
*/
#ifndef _ASM_GENERIC_MM_HOOKS_H
#define _ASM_GENERIC_MM_HOOKS_H
@@ -17,11 +17,6 @@ static inline void arch_exit_mmap(struct mm_struct *mm)
{
}
-static inline void arch_unmap(struct mm_struct *mm,
- unsigned long start, unsigned long end)
-{
-}
-
static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
bool write, bool execute, bool foreign)
{
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 485424979254..ef32d87a3adc 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1313,6 +1313,8 @@ struct vm_special_mapping {
int (*mremap)(const struct vm_special_mapping *sm,
struct vm_area_struct *new_vma);
+ void (*close)(const struct vm_special_mapping *sm,
+ struct vm_area_struct *vma);
};
enum tlb_flush_reason {
diff --git a/mm/mmap.c b/mm/mmap.c
index d0dfc85b209b..adaaf1ef197a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2789,7 +2789,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
*
* This function takes a @mas that is either pointing to the previous VMA or set
* to MA_START and sets it up to remove the mapping(s). The @len will be
- * aligned and any arch_unmap work will be preformed.
+ * aligned.
*
* Return: 0 on success and drops the lock if so directed, error and leaves the
* lock held otherwise.
@@ -2809,16 +2809,12 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
return -EINVAL;
/*
- * Check if memory is sealed before arch_unmap.
- * Prevent unmapping a sealed VMA.
+ * Check if memory is sealed, prevent unmapping a sealed VMA.
* can_modify_mm assumes we have acquired the lock on MM.
*/
if (unlikely(!can_modify_mm(mm, start, end)))
return -EPERM;
- /* arch_unmap() might do unmaps itself. */
- arch_unmap(mm, start, end);
-
/* Find the first overlapping VMA */
vma = vma_find(vmi, end);
if (!vma) {
@@ -3232,14 +3228,12 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
struct mm_struct *mm = vma->vm_mm;
/*
- * Check if memory is sealed before arch_unmap.
- * Prevent unmapping a sealed VMA.
+ * Check if memory is sealed, prevent unmapping a sealed VMA.
* can_modify_mm assumes we have acquired the lock on MM.
*/
if (unlikely(!can_modify_mm(mm, start, end)))
return -EPERM;
- arch_unmap(mm, start, end);
return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
}
@@ -3624,6 +3618,9 @@ static vm_fault_t special_mapping_fault(struct vm_fault *vmf);
*/
static void special_mapping_close(struct vm_area_struct *vma)
{
+ const struct vm_special_mapping *sm = vma->vm_private_data;
+ if (sm->close)
+ sm->close(sm, vma);
}
static const char *special_mapping_name(struct vm_area_struct *vma)
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 18:55 ` Linus Torvalds
2024-08-05 19:33 ` Linus Torvalds
@ 2024-08-05 19:37 ` Jeff Xu
2024-08-05 19:48 ` Linus Torvalds
1 sibling, 1 reply; 29+ messages in thread
From: Jeff Xu @ 2024-08-05 19:37 UTC (permalink / raw)
To: Linus Torvalds
Cc: Michael Ellerman, Nicholas Piggin, Christophe Leroy,
Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp,
linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
On Mon, Aug 5, 2024 at 12:01 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, 5 Aug 2024 at 11:11, Jeff Xu <jeffxu@google.com> wrote:
> >
> > One thing that you can't walk around is that can_modify_mm must be
> > called prior to arch_unmap, that means in-place check for the munmap
> > is not possible.
>
> Actually, we should move 'arch_unmap()'.
>
I think you meant "remove".
> There is only one user of it, and it's pretty pointless.
>
> (Ok, there are two users - x86 also has an 'arch_unmap()', but it's empty).
>
> The reason I say that the current user of arch_unmap() is pointless is
> because this is what the powerpc user does:
>
> static inline void arch_unmap(struct mm_struct *mm,
> unsigned long start, unsigned long end)
> {
> unsigned long vdso_base = (unsigned long)mm->context.vdso;
>
> if (start <= vdso_base && vdso_base < end)
> mm->context.vdso = NULL;
> }
>
> and that would make sense if we didn't have an actual 'vma' that
> matched the vdso. But we do.
>
> I think this code may predate the whole "create a vma for the vdso"
> code. Or maybe it was just always confused.
>
Agreed, it's best to remove it.
> Anyway, what the code *should* do is that we should just have a
> ->close() function for special mappings, and call that in
> special_mapping_close().
>
I'm curious: why does ppc need to unmap the vdso? (Other archs don't
have unmap logic.)
The vdso has a .remap hook; IIUC that is for the CHECKPOINT_RESTORE
feature, i.e. during restore the vdso might get relocated after being
taken from a dump. [1]
IIUC, the vdso mapping doesn't otherwise change during the lifetime of
the process. Or does it in some use cases?
[1] https://lore.kernel.org/linux-mm/20161101172214.2938-1-dsafonov@virtuozzo.com/
> This is an ENTIRELY UNTESTED patch that gets rid of this horrendous wart.
>
> Michael / Nick / Christophe? Note that I didn't even compile-test this
> on x86-64, much less on powerpc.
>
> So please consider this a "maybe something like this" patch, but that
> 'arch_unmap()' really is pretty nasty.
>
> Oh, and there was a bug in the error path of the powerpc vdso setup
> code anyway. The patch fixes that too, although considering the
> entirely untested nature of it, the "fixes" is laughably optimistic.
>
> Linus
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 19:37 ` Jeff Xu
@ 2024-08-05 19:48 ` Linus Torvalds
2024-08-05 19:50 ` Linus Torvalds
2024-08-05 23:24 ` Nicholas Piggin
0 siblings, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2024-08-05 19:48 UTC (permalink / raw)
To: Jeff Xu
Cc: Michael Ellerman, Nicholas Piggin, Christophe Leroy,
Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp,
linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
On Mon, 5 Aug 2024 at 12:38, Jeff Xu <jeffxu@google.com> wrote:
>
> I'm curious, why does ppc need to unmap vdso ? ( other archs don't
> have unmap logic.)
I have no idea. There are comments about 'perf' getting confused about
mmap counts when 'context.vdso' isn't set up.
But x86 has the same context.vdso logic, and does *not* set the
pointer before installing the vma, for example. Also does not zero it
out on munmap(), although it does have the mremap logic.
For all I know it may all be entirely unnecessary, and could be
removed entirely.
Linus
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 19:48 ` Linus Torvalds
@ 2024-08-05 19:50 ` Linus Torvalds
2024-08-05 23:24 ` Nicholas Piggin
1 sibling, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2024-08-05 19:50 UTC (permalink / raw)
To: Jeff Xu
Cc: Michael Ellerman, Nicholas Piggin, Christophe Leroy,
Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp,
linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
On Mon, 5 Aug 2024 at 12:48, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> But x86 has the same context.vdso logic, and does *not* set the
> pointer before installing the vma, for example. Also does not zero it
> out on munmap(), although it does have the mremap logic.
Oh, and the empty stale arch_unmap() code on the x86 side has never
been about the vdso thing, it was about some horrid MPX notification
that no longer exists.
In case people wonder like I did.
Linus
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 19:48 ` Linus Torvalds
2024-08-05 19:50 ` Linus Torvalds
@ 2024-08-05 23:24 ` Nicholas Piggin
2024-08-06 0:13 ` Linus Torvalds
1 sibling, 1 reply; 29+ messages in thread
From: Nicholas Piggin @ 2024-08-05 23:24 UTC (permalink / raw)
To: Linus Torvalds, Jeff Xu
Cc: Michael Ellerman, Christophe Leroy, Pedro Falcato,
kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel,
Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet,
Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin
On Tue Aug 6, 2024 at 5:48 AM AEST, Linus Torvalds wrote:
> On Mon, 5 Aug 2024 at 12:38, Jeff Xu <jeffxu@google.com> wrote:
> >
> > I'm curious, why does ppc need to unmap vdso ? ( other archs don't
> > have unmap logic.)
>
> I have no idea. There are comments about 'perf' getting confused about
> mmap counts when 'context.vdso' isn't set up.
>
> But x86 has the same context.vdso logic, and does *not* set the
> pointer before installing the vma, for example. Also does not zero it
> out on munmap(), although it does have the mremap logic.
>
> For all I know it may all be entirely unnecessary, and could be
> removed entirely.
I don't know much about the vdso code; it predated my involvement in ppc.
Commit 83d3f0e90c6c8 says CRIU (checkpoint restore in userspace) is
moving it around. Why CRIU wants to do that, I don't know.
Can userspace on other archs not unmap their vdsos?
Thanks,
Nick
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 23:24 ` Nicholas Piggin
@ 2024-08-06 0:13 ` Linus Torvalds
2024-08-06 1:22 ` Jeff Xu
2024-08-06 2:01 ` Michael Ellerman
0 siblings, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2024-08-06 0:13 UTC (permalink / raw)
To: Nicholas Piggin
Cc: Jeff Xu, Michael Ellerman, Christophe Leroy, Pedro Falcato,
kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel,
Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet,
Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin
On Mon, 5 Aug 2024 at 16:25, Nicholas Piggin <npiggin@gmail.com> wrote:
>
> Can userspace on other archs not unmap their vdsos?
I think they can, and nobody cares. The "context.vdso" value stays at
some stale value, and anybody who tries to use it will just fail.
So what makes powerpc special is not "you can unmap the vdso", but
"powerpc cares".
I just don't quite know _why_ powerpc cares.
Judging by the comments and a quick 'grep', the reason may be
arch/powerpc/perf/callchain_32.c
which seems to have some vdso knowledge.
But x86 does something kind of like that at signal frame generation
time, and doesn't care.
I really think it's an issue of "if you screw with the vdso, you get
to keep both broken pieces".
Linus
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 0:13 ` Linus Torvalds
@ 2024-08-06 1:22 ` Jeff Xu
2024-08-06 2:01 ` Michael Ellerman
1 sibling, 0 replies; 29+ messages in thread
From: Jeff Xu @ 2024-08-06 1:22 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nicholas Piggin, Michael Ellerman, Christophe Leroy,
Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp,
linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
On Mon, Aug 5, 2024 at 5:13 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, 5 Aug 2024 at 16:25, Nicholas Piggin <npiggin@gmail.com> wrote:
> >
> > Can userspace on other archs not unmap their vdsos?
>
> I think they can, and nobody cares. The "context.vdso" value stays at
> some stale value, and anybody who tries to use it will just fail.
>
I want to seal the vdso :-), so I also care (about not having it
changeable from userspace).
For the restore scenario, if the vdso is sealed, I guess CRIU won't be
able to relocate the vdso from userspace. I'm interested in hearing the
vdso devs' input on this, e.g. is it possible to make CRIU compatible
with memory sealing?
> So what makes powerpc special is not "you can unmap the vdso", but
> "powerpc cares".
>
> I just don't quite know _why_ powerpc cares.
>
> Judging by the comments and a quick 'grep', the reason may be
>
> arch/powerpc/perf/callchain_32.c
>
> which seems to have some vdso knowledge.
>
> But x86 does something kind of like that at signal frame generation
> time, and doesn't care.
>
> I really think it's an issue of "if you screw with the vdso, you get
> to keep both broken pieces".
>
> Linus
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 16:58 ` Jeff Xu
@ 2024-08-06 1:44 ` Oliver Sang
2024-08-06 14:54 ` Jeff Xu
0 siblings, 1 reply; 29+ messages in thread
From: Oliver Sang @ 2024-08-06 1:44 UTC (permalink / raw)
To: Jeff Xu
Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook,
Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman,
Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet,
Jorge Lucangeli Obes, Linus Torvalds, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin, oliver.sang
hi, Jeff,
On Mon, Aug 05, 2024 at 09:58:33AM -0700, Jeff Xu wrote:
> On Sun, Aug 4, 2024 at 1:59 AM kernel test robot <oliver.sang@intel.com> wrote:
> >
> >
> >
> > Hello,
> >
> > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on:
> >
> >
> > commit: 8be7258aad44b5e25977a98db136f677fa6f4370 ("mseal: add mseal syscall")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > testcase: stress-ng
> > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> > parameters:
> >
> > nr_threads: 100%
> > testtime: 60s
> > test: pagemove
> > cpufreq_governor: performance
> >
> >
> > In addition to that, the commit also has significant impact on the following tests:
> >
> > +------------------+---------------------------------------------------------------------------------------------+
> > | testcase: change | stress-ng: stress-ng.pkey.ops_per_sec -3.6% regression |
> > | test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory |
> > | test parameters | cpufreq_governor=performance |
> > | | nr_threads=100% |
> > | | test=pkey |
> > | | testtime=60s |
> > +------------------+---------------------------------------------------------------------------------------------+
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202408041602.caa0372-oliver.sang@intel.com
> >
> >
> > Details are as below:
> > -------------------------------------------------------------------------------------------------->
> >
> >
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20240804/202408041602.caa0372-oliver.sang@intel.com
> >
> There is an error when I try to reproduce the test:
What's your OS? We support some distributions:
https://github.com/intel/lkp-tests?tab=readme-ov-file#supported-distributions
>
> bin/lkp install job.yaml
>
> --------------------------------------------------------
> Some packages could not be installed. This may mean that you have
> requested an impossible situation or if you are using the unstable
> distribution that some required packages have not yet been created
> or been moved out of Incoming.
> The following information may help to resolve the situation:
>
> The following packages have unmet dependencies:
> libdw1 : Depends: libelf1 (= 0.190-1+b1)
> libdw1t64 : Breaks: libdw1 (< 0.191-2)
> E: Unable to correct problems, you have held broken packages.
> Cannot install some packages of perf-c2c depends
> -----------------------------------------------------------------------------------------
>
> And where is stress-ng.pagemove.page_remaps_per_sec test implemented,
> is that part of lkp-tests ?
stress-ng is in https://github.com/ColinIanKing/stress-ng
>
> Thanks
> -Jeff
>
> > =========================================================================================
> > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
> >
> > commit:
> > ff388fe5c4 ("mseal: wire up mseal syscall")
> > 8be7258aad ("mseal: add mseal syscall")
> >
> > ff388fe5c481d39c 8be7258aad44b5e25977a98db13
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 41625945 -4.3% 39842322 proc-vmstat.numa_hit
> > 41559175 -4.3% 39774160 proc-vmstat.numa_local
> > 77484314 -4.4% 74105555 proc-vmstat.pgalloc_normal
> > 77205752 -4.4% 73826672 proc-vmstat.pgfree
> > 18361466 -4.2% 17596652 stress-ng.pagemove.ops
> > 306014 -4.2% 293262 stress-ng.pagemove.ops_per_sec
> > 205312 -4.4% 196176 stress-ng.pagemove.page_remaps_per_sec
> > 4961 +1.0% 5013 stress-ng.time.percent_of_cpu_this_job_got
> > 2917 +1.2% 2952 stress-ng.time.system_time
> > 1.07 -6.6% 1.00 perf-stat.i.MPKI
> > 3.354e+10 +3.5% 3.473e+10 perf-stat.i.branch-instructions
> > 1.795e+08 -4.2% 1.719e+08 perf-stat.i.cache-misses
> > 2.376e+08 -4.1% 2.279e+08 perf-stat.i.cache-references
> > 1.13 -3.0% 1.10 perf-stat.i.cpi
> > 1077 +4.3% 1124 perf-stat.i.cycles-between-cache-misses
> > 1.717e+11 +2.7% 1.762e+11 perf-stat.i.instructions
> > 0.88 +3.1% 0.91 perf-stat.i.ipc
> > 1.05 -6.8% 0.97 perf-stat.overall.MPKI
> > 0.25 ą 2% -0.0 0.24 perf-stat.overall.branch-miss-rate%
> > 1.13 -3.0% 1.10 perf-stat.overall.cpi
> > 1084 +4.0% 1127 perf-stat.overall.cycles-between-cache-misses
> > 0.88 +3.1% 0.91 perf-stat.overall.ipc
> > 3.298e+10 +3.5% 3.415e+10 perf-stat.ps.branch-instructions
> > 1.764e+08 -4.3% 1.689e+08 perf-stat.ps.cache-misses
> > 2.336e+08 -4.1% 2.24e+08 perf-stat.ps.cache-references
> > 194.57 -2.4% 189.96 ą 2% perf-stat.ps.cpu-migrations
> > 1.688e+11 +2.7% 1.733e+11 perf-stat.ps.instructions
> > 1.036e+13 +3.0% 1.068e+13 perf-stat.total.instructions
> > 75.12 -1.9 73.22 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > 36.84 -1.6 35.29 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> > 24.90 -1.2 23.72 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 19.89 -0.9 18.98 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > 10.56 ą 2% -0.8 9.78 ą 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread
> > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
> > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> > 10.57 ą 2% -0.8 9.80 ą 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> > 10.52 ą 2% -0.8 9.75 ą 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn
> > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
> > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
> > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
> > 14.75 -0.7 14.07 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> > 1.50 -0.6 0.94 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> > 5.88 ą 2% -0.4 5.47 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> > 7.80 -0.3 7.47 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 4.55 ą 2% -0.3 4.24 ą 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
> > 6.76 -0.3 6.45 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > 6.15 -0.3 5.86 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> > 8.22 -0.3 7.93 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > 6.12 -0.3 5.87 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > 5.74 -0.2 5.50 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
> > 3.16 ą 2% -0.2 2.94 perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> > 5.50 -0.2 5.28 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> > 1.36 -0.2 1.14 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
> > 5.15 -0.2 4.94 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap
> > 5.51 -0.2 5.31 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > 5.16 -0.2 4.97 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma
> > 2.24 -0.2 2.05 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> > 2.60 ą 2% -0.2 2.42 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs
> > 4.67 -0.2 4.49 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma
> > 3.41 -0.2 3.23 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > 3.00 -0.2 2.83 ą 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > 0.96 -0.2 0.80 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to
> > 4.04 -0.2 3.88 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> > 3.20 ą 2% -0.2 3.04 ą 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
> > 3.53 -0.1 3.38 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap
> > 3.40 -0.1 3.26 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap
> > 2.20 ą 2% -0.1 2.06 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> > 1.84 ą 3% -0.1 1.71 ą 3% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap
> > 1.78 ą 2% -0.1 1.65 ą 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > 2.69 -0.1 2.56 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap
> > 1.78 ą 2% -0.1 1.66 ą 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
> > 1.36 ą 2% -0.1 1.23 ą 2% perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> > 0.95 -0.1 0.83 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap
> > 3.29 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > 2.08 -0.1 1.96 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > 1.43 ą 3% -0.1 1.32 ą 3% perf-profile.calltrace.cycles-pp.down_write.vma_prepare.vma_merge.copy_vma.move_vma
> > 2.21 -0.1 2.10 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > 2.47 -0.1 2.36 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma
> > 2.21 -0.1 2.12 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables
> > 1.41 -0.1 1.32 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> > 1.26 -0.1 1.18 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap
> > 1.82 -0.1 1.75 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> > 0.71 -0.1 0.63 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > 1.29 -0.1 1.22 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > 0.61 -0.1 0.54 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma
> > 1.36 -0.1 1.29 perf-profile.calltrace.cycles-pp.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap
> > 1.40 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma
> > 0.70 -0.1 0.64 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
> > 1.23 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma
> > 1.66 -0.1 1.60 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 1.16 -0.1 1.10 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> > 0.96 -0.1 0.90 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region
> > 1.14 -0.1 1.08 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
> > 0.79 -0.1 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma
> > 1.04 -0.1 1.00 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 0.58 -0.0 0.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap
> > 0.61 -0.0 0.56 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
> > 0.56 -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
> > 0.57 -0.0 0.53 ą 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
> > 0.78 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge
> > 0.88 -0.0 0.84 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64
> > 0.70 -0.0 0.66 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap
> > 0.97 -0.0 0.93 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > 1.11 -0.0 1.08 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap
> > 0.75 -0.0 0.72 perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
> > 0.74 -0.0 0.71 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap
> > 0.60 ą 2% -0.0 0.57 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64
> > 0.67 ą 2% -0.0 0.64 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma
> > 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > 0.63 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > 0.99 -0.0 0.96 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > 0.62 ą 2% -0.0 0.59 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
> > 0.87 -0.0 0.84 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > 0.78 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap
> > 0.64 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap
> > 0.90 -0.0 0.87 perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > 0.54 -0.0 0.52 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.move_ptes.move_page_tables.move_vma.__do_sys_mremap
> > 1.04 +0.0 1.08 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
> > 0.76 +0.1 0.83 perf-profile.calltrace.cycles-pp.__madvise
> > 0.63 +0.1 0.70 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> > 0.62 +0.1 0.70 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> > 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
> > 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> > 87.74 +0.7 88.45 perf-profile.calltrace.cycles-pp.mremap
> > 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap
> > 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap
> > 84.88 +0.9 85.77 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap
> > 84.73 +0.9 85.62 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > 0.00 +0.9 0.92 ą 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma
> > 83.84 +0.9 84.78 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > 0.00 +1.1 1.06 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64
> > 0.00 +1.2 1.21 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to
> > 2.07 +1.5 3.55 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 1.58 +1.5 3.07 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64
> > 0.00 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap
> > 0.00 +1.6 1.57 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > 0.00 +1.7 1.72 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
> > 0.00 +2.0 2.01 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> > 5.39 +2.9 8.32 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > 75.29 -1.9 73.37 perf-profile.children.cycles-pp.move_vma
> > 37.06 -1.6 35.50 perf-profile.children.cycles-pp.do_vmi_align_munmap
> > 24.98 -1.2 23.80 perf-profile.children.cycles-pp.copy_vma
> > 19.99 -1.0 19.02 perf-profile.children.cycles-pp.handle_softirqs
> > 19.97 -1.0 19.00 perf-profile.children.cycles-pp.rcu_core
> > 19.95 -1.0 18.98 perf-profile.children.cycles-pp.rcu_do_batch
> > 19.98 -0.9 19.06 perf-profile.children.cycles-pp.__split_vma
> > 17.55 -0.8 16.76 perf-profile.children.cycles-pp.kmem_cache_free
> > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.children.cycles-pp.run_ksoftirqd
> > 10.57 ą 2% -0.8 9.80 ą 2% perf-profile.children.cycles-pp.smpboot_thread_fn
> > 15.38 -0.8 14.62 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
> > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.children.cycles-pp.kthread
> > 10.62 ą 2% -0.8 9.86 ą 2% perf-profile.children.cycles-pp.ret_from_fork
> > 10.62 ą 2% -0.8 9.86 ą 2% perf-profile.children.cycles-pp.ret_from_fork_asm
> > 15.14 -0.7 14.44 perf-profile.children.cycles-pp.vma_merge
> > 12.08 -0.5 11.55 perf-profile.children.cycles-pp.__slab_free
> > 12.11 -0.5 11.62 perf-profile.children.cycles-pp.mas_wr_store_entry
> > 10.86 -0.5 10.39 perf-profile.children.cycles-pp.vm_area_dup
> > 11.89 -0.5 11.44 perf-profile.children.cycles-pp.mas_store_prealloc
> > 8.49 -0.4 8.06 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
> > 9.88 -0.4 9.49 perf-profile.children.cycles-pp.mas_wr_node_store
> > 7.91 -0.3 7.58 perf-profile.children.cycles-pp.move_page_tables
> > 6.06 -0.3 5.78 perf-profile.children.cycles-pp.vm_area_free_rcu_cb
> > 8.28 -0.3 8.00 perf-profile.children.cycles-pp.unmap_region
> > 6.69 -0.3 6.42 perf-profile.children.cycles-pp.vma_complete
> > 5.06 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate
> > 5.82 -0.2 5.57 perf-profile.children.cycles-pp.move_ptes
> > 4.24 -0.2 4.01 perf-profile.children.cycles-pp.anon_vma_clone
> > 3.50 -0.2 3.30 perf-profile.children.cycles-pp.down_write
> > 2.44 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev
> > 3.46 -0.2 3.28 perf-profile.children.cycles-pp.___slab_alloc
> > 3.45 -0.2 3.27 perf-profile.children.cycles-pp.free_pgtables
> > 2.54 -0.2 2.37 perf-profile.children.cycles-pp.rcu_cblist_dequeue
> > 3.35 -0.2 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook
> > 2.93 -0.2 2.78 perf-profile.children.cycles-pp.mas_alloc_nodes
> > 2.28 ± 2% -0.2 2.12 ± 2% perf-profile.children.cycles-pp.vma_prepare
> > 3.46 -0.1 3.32 perf-profile.children.cycles-pp.flush_tlb_mm_range
> > 3.41 -0.1 3.27 ± 2% perf-profile.children.cycles-pp.mod_objcg_state
> > 2.76 -0.1 2.63 perf-profile.children.cycles-pp.unlink_anon_vmas
> > 3.41 -0.1 3.28 perf-profile.children.cycles-pp.mas_store_gfp
> > 2.21 -0.1 2.09 perf-profile.children.cycles-pp.__cond_resched
> > 2.04 -0.1 1.94 perf-profile.children.cycles-pp.allocate_slab
> > 2.10 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common
> > 2.51 -0.1 2.40 perf-profile.children.cycles-pp.flush_tlb_func
> > 1.04 -0.1 0.94 perf-profile.children.cycles-pp.mas_prev
> > 2.71 -0.1 2.61 perf-profile.children.cycles-pp.mtree_load
> > 2.23 -0.1 2.14 perf-profile.children.cycles-pp.native_flush_tlb_one_user
> > 0.22 ± 5% -0.1 0.13 ± 13% perf-profile.children.cycles-pp.vm_stat_account
> > 0.95 -0.1 0.87 perf-profile.children.cycles-pp.mas_prev_setup
> > 1.65 -0.1 1.57 perf-profile.children.cycles-pp.mas_wr_walk
> > 1.84 -0.1 1.76 perf-profile.children.cycles-pp.up_write
> > 1.27 -0.1 1.20 perf-profile.children.cycles-pp.mas_prev_slot
> > 1.84 -0.1 1.77 perf-profile.children.cycles-pp.vma_link
> > 1.39 -0.1 1.32 perf-profile.children.cycles-pp.shuffle_freelist
> > 0.96 -0.1 0.90 ± 2% perf-profile.children.cycles-pp.rcu_all_qs
> > 0.86 -0.1 0.80 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> > 1.70 -0.1 1.64 perf-profile.children.cycles-pp.__get_unmapped_area
> > 0.34 ± 3% -0.1 0.29 ± 5% perf-profile.children.cycles-pp.security_vm_enough_memory_mm
> > 0.60 -0.0 0.55 perf-profile.children.cycles-pp.entry_SYSCALL_64
> > 0.92 -0.0 0.87 perf-profile.children.cycles-pp.percpu_counter_add_batch
> > 1.07 -0.0 1.02 perf-profile.children.cycles-pp.vma_to_resize
> > 1.59 -0.0 1.54 perf-profile.children.cycles-pp.mas_update_gap
> > 0.44 ± 2% -0.0 0.40 ± 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> > 0.70 -0.0 0.66 perf-profile.children.cycles-pp.syscall_return_via_sysret
> > 1.13 -0.0 1.09 perf-profile.children.cycles-pp.mt_find
> > 0.20 ± 6% -0.0 0.17 ± 9% perf-profile.children.cycles-pp.cap_vm_enough_memory
> > 0.99 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node
> > 0.63 ± 2% -0.0 0.59 perf-profile.children.cycles-pp.security_mmap_addr
> > 0.62 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials
> > 1.17 -0.0 1.14 perf-profile.children.cycles-pp.clear_bhb_loop
> > 0.46 -0.0 0.43 ± 2% perf-profile.children.cycles-pp.__alloc_pages_noprof
> > 0.44 -0.0 0.41 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
> > 0.90 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete
> > 0.64 ± 2% -0.0 0.62 perf-profile.children.cycles-pp.get_old_pud
> > 1.07 -0.0 1.05 perf-profile.children.cycles-pp.mas_leaf_max_gap
> > 0.22 ± 3% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.__rmqueue_pcplist
> > 0.55 -0.0 0.53 perf-profile.children.cycles-pp.refill_obj_stock
> > 0.25 -0.0 0.23 ± 3% perf-profile.children.cycles-pp.rmqueue
> > 0.48 -0.0 0.45 perf-profile.children.cycles-pp.mremap_userfaultfd_prep
> > 0.33 -0.0 0.30 perf-profile.children.cycles-pp.free_unref_page
> > 0.46 -0.0 0.44 perf-profile.children.cycles-pp.setup_object
> > 0.21 ± 3% -0.0 0.19 ± 2% perf-profile.children.cycles-pp.rmqueue_bulk
> > 0.31 ± 3% -0.0 0.29 perf-profile.children.cycles-pp.__vm_enough_memory
> > 0.40 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 0.36 -0.0 0.35 perf-profile.children.cycles-pp.madvise_vma_behavior
> > 0.54 -0.0 0.53 ± 2% perf-profile.children.cycles-pp.mas_wr_end_piv
> > 0.46 -0.0 0.44 ± 2% perf-profile.children.cycles-pp.rcu_segcblist_enqueue
> > 0.34 -0.0 0.32 ± 2% perf-profile.children.cycles-pp.mas_destroy
> > 0.28 -0.0 0.26 ± 3% perf-profile.children.cycles-pp.mas_wr_store_setup
> > 0.30 -0.0 0.28 perf-profile.children.cycles-pp.pte_offset_map_nolock
> > 0.19 -0.0 0.18 ± 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders
> > 0.08 ± 4% -0.0 0.07 perf-profile.children.cycles-pp.ksm_madvise
> > 0.17 -0.0 0.16 perf-profile.children.cycles-pp.get_any_partial
> > 0.08 -0.0 0.07 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
> > 0.45 +0.0 0.47 perf-profile.children.cycles-pp._raw_spin_lock
> > 1.10 +0.0 1.14 perf-profile.children.cycles-pp.zap_pte_range
> > 0.78 +0.1 0.85 perf-profile.children.cycles-pp.__madvise
> > 0.63 +0.1 0.70 perf-profile.children.cycles-pp.__x64_sys_madvise
> > 0.62 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise
> > 0.00 +0.1 0.09 ± 4% perf-profile.children.cycles-pp.can_modify_mm_madv
> > 1.32 +0.1 1.46 perf-profile.children.cycles-pp.mas_next_slot
> > 88.13 +0.7 88.83 perf-profile.children.cycles-pp.mremap
> > 83.94 +0.9 84.88 perf-profile.children.cycles-pp.__do_sys_mremap
> > 86.06 +0.9 87.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 85.56 +1.0 86.54 perf-profile.children.cycles-pp.do_syscall_64
> > 40.49 +1.4 41.90 perf-profile.children.cycles-pp.do_vmi_munmap
> > 2.10 +1.5 3.57 perf-profile.children.cycles-pp.do_munmap
> > 3.62 +2.3 5.90 perf-profile.children.cycles-pp.mas_walk
> > 5.44 +2.9 8.38 perf-profile.children.cycles-pp.mremap_to
> > 5.30 +3.1 8.39 perf-profile.children.cycles-pp.mas_find
> > 0.00 +5.4 5.40 perf-profile.children.cycles-pp.can_modify_mm
> > 11.46 -0.5 10.96 perf-profile.self.cycles-pp.__slab_free
> > 4.30 -0.2 4.08 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
> > 2.51 -0.2 2.34 perf-profile.self.cycles-pp.rcu_cblist_dequeue
> > 2.41 ± 2% -0.2 2.25 perf-profile.self.cycles-pp.down_write
> > 2.21 -0.1 2.11 perf-profile.self.cycles-pp.native_flush_tlb_one_user
> > 2.37 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load
> > 1.60 -0.1 1.51 perf-profile.self.cycles-pp.__memcg_slab_free_hook
> > 0.18 ± 3% -0.1 0.10 ± 15% perf-profile.self.cycles-pp.vm_stat_account
> > 1.25 -0.1 1.18 perf-profile.self.cycles-pp.move_vma
> > 1.76 -0.1 1.69 perf-profile.self.cycles-pp.mod_objcg_state
> > 1.42 -0.1 1.35 ± 2% perf-profile.self.cycles-pp.__call_rcu_common
> > 1.41 -0.1 1.34 perf-profile.self.cycles-pp.mas_wr_walk
> > 1.52 -0.1 1.46 perf-profile.self.cycles-pp.up_write
> > 1.02 -0.1 0.95 perf-profile.self.cycles-pp.mas_prev_slot
> > 0.96 -0.1 0.90 ± 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb
> > 1.50 -0.1 1.45 perf-profile.self.cycles-pp.kmem_cache_free
> > 0.69 ± 3% -0.1 0.64 ± 2% perf-profile.self.cycles-pp.rcu_all_qs
> > 1.14 ± 2% -0.1 1.09 perf-profile.self.cycles-pp.shuffle_freelist
> > 1.10 -0.1 1.05 perf-profile.self.cycles-pp.__cond_resched
> > 1.40 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap
> > 0.99 -0.0 0.94 perf-profile.self.cycles-pp.mas_preallocate
> > 0.88 -0.0 0.83 perf-profile.self.cycles-pp.___slab_alloc
> > 0.55 -0.0 0.50 perf-profile.self.cycles-pp.mremap_to
> > 0.98 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes
> > 0.78 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch
> > 0.21 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64
> > 0.44 ± 2% -0.0 0.40 ± 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> > 0.92 -0.0 0.89 perf-profile.self.cycles-pp.mas_store_gfp
> > 0.86 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node
> > 0.50 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> > 1.15 -0.0 1.12 perf-profile.self.cycles-pp.clear_bhb_loop
> > 1.14 -0.0 1.11 perf-profile.self.cycles-pp.vma_merge
> > 0.66 -0.0 0.63 perf-profile.self.cycles-pp.__split_vma
> > 0.16 ± 6% -0.0 0.13 ± 7% perf-profile.self.cycles-pp.cap_vm_enough_memory
> > 0.82 -0.0 0.79 perf-profile.self.cycles-pp.mas_wr_store_entry
> > 0.54 ± 2% -0.0 0.52 perf-profile.self.cycles-pp.get_old_pud
> > 0.43 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap
> > 0.51 ± 2% -0.0 0.48 ± 2% perf-profile.self.cycles-pp.security_mmap_addr
> > 0.50 -0.0 0.48 perf-profile.self.cycles-pp.refill_obj_stock
> > 0.24 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev
> > 0.71 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range
> > 0.48 -0.0 0.45 perf-profile.self.cycles-pp.find_vma_prev
> > 0.42 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> > 0.66 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc
> > 0.31 -0.0 0.29 perf-profile.self.cycles-pp.mas_prev_setup
> > 0.43 -0.0 0.41 perf-profile.self.cycles-pp.mas_wr_end_piv
> > 0.78 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete
> > 0.28 -0.0 0.26 ± 2% perf-profile.self.cycles-pp.mas_put_in_tree
> > 0.42 -0.0 0.40 perf-profile.self.cycles-pp.mremap_userfaultfd_prep
> > 0.28 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables
> > 0.39 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> > 0.30 ± 2% -0.0 0.28 perf-profile.self.cycles-pp.zap_pmd_range
> > 0.32 -0.0 0.31 perf-profile.self.cycles-pp.unmap_vmas
> > 0.21 -0.0 0.20 perf-profile.self.cycles-pp.__get_unmapped_area
> > 0.18 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.lru_add_drain_cpu
> > 0.06 -0.0 0.05 perf-profile.self.cycles-pp.ksm_madvise
> > 0.45 +0.0 0.46 perf-profile.self.cycles-pp.do_vmi_munmap
> > 0.37 +0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock
> > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot
> > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find
> > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm
> > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk
> >
> >
> > ***************************************************************************************************
> > lkp-spr-r02: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
> > =========================================================================================
> > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pkey/stress-ng/60s
> >
> > commit:
> > ff388fe5c4 ("mseal: wire up mseal syscall")
> > 8be7258aad ("mseal: add mseal syscall")
> >
> > ff388fe5c481d39c 8be7258aad44b5e25977a98db13
> > ---------------- ---------------------------
> > %stddev %change %stddev
> > \ | \
> > 10539 -2.5% 10273 vmstat.system.cs
> > 0.28 ± 5% -20.1% 0.22 ± 7% sched_debug.cfs_rq:/.h_nr_running.stddev
> > 1419 ± 7% -15.3% 1202 ± 6% sched_debug.cfs_rq:/.util_avg.max
> > 0.28 ± 6% -18.4% 0.23 ± 8% sched_debug.cpu.nr_running.stddev
> > 8.736e+08 -3.6% 8.423e+08 stress-ng.pkey.ops
> > 14560560 -3.6% 14038795 stress-ng.pkey.ops_per_sec
> > 770.39 ± 4% -5.0% 732.04 stress-ng.time.user_time
> > 244657 ± 3% +5.8% 258782 ± 3% proc-vmstat.nr_slab_unreclaimable
> > 73133541 -2.1% 71588873 proc-vmstat.numa_hit
> > 72873579 -2.1% 71357274 proc-vmstat.numa_local
> > 1.842e+08 -2.5% 1.796e+08 proc-vmstat.pgalloc_normal
> > 1.767e+08 -2.8% 1.717e+08 proc-vmstat.pgfree
> > 1345346 ± 40% -73.1% 362064 ±124% numa-vmstat.node0.nr_inactive_anon
> > 1345340 ± 40% -73.1% 362062 ±124% numa-vmstat.node0.nr_zone_inactive_anon
> > 2420830 ± 14% +35.1% 3270248 ± 16% numa-vmstat.node1.nr_file_pages
> > 2067871 ± 13% +51.5% 3132982 ± 17% numa-vmstat.node1.nr_inactive_anon
> > 191406 ± 17% +33.6% 255808 ± 14% numa-vmstat.node1.nr_mapped
> > 2452 ± 61% +104.4% 5012 ± 35% numa-vmstat.node1.nr_page_table_pages
> > 2067853 ± 13% +51.5% 3132966 ± 17% numa-vmstat.node1.nr_zone_inactive_anon
> > 5379238 ± 40% -73.0% 1453605 ±123% numa-meminfo.node0.Inactive
> > 5379166 ± 40% -73.0% 1453462 ±123% numa-meminfo.node0.Inactive(anon)
> > 8741077 ± 22% -36.7% 5531290 ± 28% numa-meminfo.node0.MemUsed
> > 9651902 ± 13% +35.8% 13105318 ± 16% numa-meminfo.node1.FilePages
> > 8239855 ± 13% +52.4% 12556929 ± 17% numa-meminfo.node1.Inactive
> > 8239712 ± 13% +52.4% 12556853 ± 17% numa-meminfo.node1.Inactive(anon)
> > 761944 ± 18% +34.6% 1025906 ± 14% numa-meminfo.node1.Mapped
> > 11679628 ± 11% +31.2% 15322841 ± 14% numa-meminfo.node1.MemUsed
> > 9874 ± 62% +104.6% 20200 ± 36% numa-meminfo.node1.PageTables
> > 0.74 -4.2% 0.71 perf-stat.i.MPKI
> > 1.245e+11 +2.3% 1.274e+11 perf-stat.i.branch-instructions
> > 0.37 -0.0 0.35 perf-stat.i.branch-miss-rate%
> > 4.359e+08 -2.1% 4.265e+08 perf-stat.i.branch-misses
> > 4.672e+08 -2.6% 4.548e+08 perf-stat.i.cache-misses
> > 7.276e+08 -2.7% 7.082e+08 perf-stat.i.cache-references
> > 1.00 -1.6% 0.98 perf-stat.i.cpi
> > 1364 +2.9% 1404 perf-stat.i.cycles-between-cache-misses
> > 6.392e+11 +1.7% 6.499e+11 perf-stat.i.instructions
> > 1.00 +1.6% 1.02 perf-stat.i.ipc
> > 0.74 -4.3% 0.71 perf-stat.overall.MPKI
> > 0.35 -0.0 0.33 perf-stat.overall.branch-miss-rate%
> > 1.00 -1.6% 0.99 perf-stat.overall.cpi
> > 1356 +2.9% 1395 perf-stat.overall.cycles-between-cache-misses
> > 1.00 +1.6% 1.01 perf-stat.overall.ipc
> > 1.209e+11 +1.9% 1.232e+11 perf-stat.ps.branch-instructions
> > 4.188e+08 -2.6% 4.077e+08 perf-stat.ps.branch-misses
> > 4.585e+08 -3.1% 4.441e+08 perf-stat.ps.cache-misses
> > 7.124e+08 -3.1% 6.901e+08 perf-stat.ps.cache-references
> > 10321 -2.6% 10053 perf-stat.ps.context-switches
> >
> >
> >
> >
> >
> > Disclaimer:
> > Results have been estimated based on internal Intel analysis and are provided
> > for informational purposes only. Any difference in system hardware or software
> > design or configuration may affect actual performance.
> >
> >
> > --
> > 0-DAY CI Kernel Test Service
> > https://github.com/intel/lkp-tests/wiki
> >
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 0:13 ` Linus Torvalds
2024-08-06 1:22 ` Jeff Xu
@ 2024-08-06 2:01 ` Michael Ellerman
2024-08-06 2:15 ` Linus Torvalds
2024-09-13 5:47 ` Christophe Leroy
1 sibling, 2 replies; 29+ messages in thread
From: Michael Ellerman @ 2024-08-06 2:01 UTC (permalink / raw)
To: Linus Torvalds, Nicholas Piggin
Cc: Jeff Xu, Christophe Leroy, Pedro Falcato, kernel test robot,
Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook,
Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck,
Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Mon, 5 Aug 2024 at 16:25, Nicholas Piggin <npiggin@gmail.com> wrote:
>>
>> Can userspace on other archs not unmap their vdsos?
>
> I think they can, and nobody cares. The "context.vdso" value stays at
> some stale value, and anybody who tries to use it will just fail.
>
> So what makes powerpc special is not "you can unmap the vdso", but
> "powerpc cares".
>
> I just don't quite know _why_ powerpc cares.
AFAIK for CRIU the problem is signal delivery:
arch/powerpc/kernel/signal_64.c:
int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
struct task_struct *tsk)
{
...
/* Set up to return from userspace. */
if (tsk->mm->context.vdso) {
regs_set_return_ip(regs, VDSO64_SYMBOL(tsk->mm->context.vdso, sigtramp_rt64));
ie. if the VDSO is moved but mm->context.vdso is not updated, signal
delivery will crash in userspace.
x86-64 always uses SA_RESTORER, and arm64 & s390 can use SA_RESTORER, so
I think CRIU uses that to avoid problems with signal delivery when the
VDSO is moved.
riscv doesn't support SA_RESTORER, but I guess CRIU doesn't support riscv
yet, so it hasn't become a problem.
There was a patch to support SA_RESTORER on powerpc, but I balked at
merging it because I couldn't find anyone on the glibc side to say
whether they wanted it or not. I guess I should have just merged it.
There was an attempt to unify all the vdso stuff and handle the
VDSO mremap case in generic code:
https://lore.kernel.org/lkml/20210611180242.711399-17-dima@arista.com/
But I think that series got a bit big and complicated and Dmitry had to
move on to other things.
cheers
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 19:33 ` Linus Torvalds
@ 2024-08-06 2:14 ` Michael Ellerman
2024-08-06 2:17 ` Linus Torvalds
2024-08-06 6:04 ` Oliver Sang
1 sibling, 1 reply; 29+ messages in thread
From: Michael Ellerman @ 2024-08-06 2:14 UTC (permalink / raw)
To: Linus Torvalds, Jeff Xu, Nicholas Piggin, Christophe Leroy
Cc: Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp,
linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Mon, 5 Aug 2024 at 11:55, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> So please consider this a "maybe something like this" patch, but that
>> 'arch_unmap()' really is pretty nasty
>
> Actually, the whole powerpc vdso code confused me. It's not the vvar
> thing that wants this close thing, it's the other ones that have the
> remap thing.
>
> .. and there were two of those error cases that needed to reset the
> vdso pointer.
>
> That all shows just how carefully I was reading this code.
>
> New version - still untested, but now I've read through it one more
> time - attached.
Needs a slight tweak to compile, vvar_close() needs to return void. And
should probably be renamed vdso_close(). Diff below if anyone else wants
to test it.
I'm testing it now, but it should do what we need.
cheers
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 6fa041a6690a..431b46976db8 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -81,8 +81,8 @@ static int vdso64_mremap(const struct vm_special_mapping *sm, struct vm_area_str
return vdso_mremap(sm, new_vma, &vdso64_end - &vdso64_start);
}
-static int vvar_close(const struct vm_special_mapping *sm,
- struct vm_area_struct *vma)
+static void vdso_close(const struct vm_special_mapping *sm,
+ struct vm_area_struct *vma)
{
struct mm_struct *mm = vma->vm_mm;
mm->context.vdso = NULL;
@@ -99,13 +99,13 @@ static struct vm_special_mapping vvar_spec __ro_after_init = {
static struct vm_special_mapping vdso32_spec __ro_after_init = {
.name = "[vdso]",
.mremap = vdso32_mremap,
- .close = vvar_close,
+ .close = vdso_close,
};
static struct vm_special_mapping vdso64_spec __ro_after_init = {
.name = "[vdso]",
.mremap = vdso64_mremap,
- .close = vvar_close,
+ .close = vdso_close,
};
#ifdef CONFIG_TIME_NS
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 2:01 ` Michael Ellerman
@ 2024-08-06 2:15 ` Linus Torvalds
2024-09-13 5:47 ` Christophe Leroy
1 sibling, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2024-08-06 2:15 UTC (permalink / raw)
To: Michael Ellerman
Cc: Nicholas Piggin, Jeff Xu, Christophe Leroy, Pedro Falcato,
kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel,
Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet,
Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin
On Mon, 5 Aug 2024 at 19:01, Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> >
> > I just don't quite know _why_ powerpc cares.
>
> AFAIK for CRIU the problem is signal delivery:
Hmm. Well, the patch I sent out should keep it all working.
In fact, to some degree it would make it much more straightforward for
other architectures to do the same thing.
Instead of a random "arch_munmap()" hack, it's a fairly reasonable
_install_special_mapping() extension.
That said, the *other* thing I don't really understand is the strange
"we have to set the context.vdso value before calling
install_special_mapping":
/*
* Put vDSO base into mm struct. We need to do this before calling
* install_special_mapping or the perf counter mmap tracking code
* will fail to recognise it as a vDSO.
*/
and that looks odd too.
Anyway, I wish we could just get rid of all the horrible signal restore crap.
We used to just put it in the stack, and that worked really well apart
from the whole WX thing.
I wonder if we should just go back to that, and turn the resulting
"page fault due to non-executable stack" into a "sigreturn system
call".
And yes, SA_RESTORER is the right thing. It's basically just user
space telling us where it is. And happily, on x86-64 we just forced
the issue, and we do
/* x86-64 should always use SA_RESTORER. */
if (!(ksig->ka.sa.sa_flags & SA_RESTORER))
return -EFAULT;
so you literally cannot have signals without it.
Linus
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 2:14 ` Michael Ellerman
@ 2024-08-06 2:17 ` Linus Torvalds
2024-08-06 12:03 ` Michael Ellerman
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2024-08-06 2:17 UTC (permalink / raw)
To: Michael Ellerman
Cc: Jeff Xu, Nicholas Piggin, Christophe Leroy, Pedro Falcato,
kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel,
Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet,
Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin
On Mon, 5 Aug 2024 at 19:14, Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Needs a slight tweak to compile, vvar_close() needs to return void.
Ack, shows just how untested it was.
> And should probably be renamed vdso_close().
.. and that was due to the initial confusion that I then fixed, but
didn't fix the naming.
So yes, those fixes look ObviouslyCorrect(tm).
Linus
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-05 19:33 ` Linus Torvalds
2024-08-06 2:14 ` Michael Ellerman
@ 2024-08-06 6:04 ` Oliver Sang
2024-08-06 14:38 ` Linus Torvalds
2024-08-06 21:37 ` Pedro Falcato
1 sibling, 2 replies; 29+ messages in thread
From: Oliver Sang @ 2024-08-06 6:04 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jeff Xu, Michael Ellerman, Nicholas Piggin, Christophe Leroy,
Pedro Falcato, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton,
Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman,
Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes,
Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger,
Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco,
Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang,
fengwei.yin, oliver.sang
hi, Linus,
On Mon, Aug 05, 2024 at 12:33:58PM -0700, Linus Torvalds wrote:
> On Mon, 5 Aug 2024 at 11:55, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > So please consider this a "maybe something like this" patch, but that
> > 'arch_unmap()' really is pretty nasty
>
> Actually, the whole powerpc vdso code confused me. It's not the vvar
> thing that wants this close thing, it's the other ones that have the
> remap thing.
>
> .. and there were two of those error cases that needed to reset the
> vdso pointer.
>
> That all shows just how carefully I was reading this code.
>
> New version - still untested, but now I've read through it one more
> time - attached.
we tested this version by applying it directly on top of 8be7258aad, but it
seems to have little impact on performance. We still see a similar regression
when comparing to ff388fe5c4.
(the data for 8be7258aad and ff388fe5c4 differ a little from what we had in
the previous report, since we reran the tests with the gcc-12 compiler. The
0day team recently switched back to gcc-12 from gcc-13 due to some issues)
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
commit:
ff388fe5c4 ("mseal: wire up mseal syscall")
8be7258aad ("mseal: add mseal syscall")
4605212a16 <--- your patch
ff388fe5c481d39c 8be7258aad44b5e25977a98db13 4605212a162071afdd9c713e936
---------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev
\ | \ | \
4958 +1.3% 5024 +1.2% 5020 time.percent_of_cpu_this_job_got
2916 +1.5% 2960 +1.4% 2957 time.system_time
65.85 -7.0% 61.27 -7.0% 61.23 time.user_time
41535129 -4.5% 39669773 -4.3% 39746835 proc-vmstat.numa_hit
41465484 -4.5% 39602956 -4.3% 39677556 proc-vmstat.numa_local
77303973 -4.6% 73780662 -4.4% 73912128 proc-vmstat.pgalloc_normal
77022096 -4.6% 73502058 -4.4% 73637326 proc-vmstat.pgfree
18381956 -4.9% 17473438 -5.0% 17457167 stress-ng.pagemove.ops
306349 -4.9% 291188 -5.0% 290931 stress-ng.pagemove.ops_per_sec
209930 -6.2% 196996 ± 2% -7.6% 193911 stress-ng.pagemove.page_remaps_per_sec
4958 +1.3% 5024 +1.2% 5020 stress-ng.time.percent_of_cpu_this_job_got
2916 +1.5% 2960 +1.4% 2957 stress-ng.time.system_time
1.13 -2.1% 1.10 -2.2% 1.10 perf-stat.i.cpi
0.89 +2.2% 0.91 +2.1% 0.91 perf-stat.i.ipc
1.04 -7.2% 0.97 -7.1% 0.97 perf-stat.overall.MPKI
1.13 -2.3% 1.10 -2.2% 1.10 perf-stat.overall.cpi
1082 +5.4% 1140 +5.3% 1139 perf-stat.overall.cycles-between-cache-misses
0.89 +2.3% 0.91 +2.3% 0.91 perf-stat.overall.ipc
192.79 -3.9% 185.32 ± 2% -2.4% 188.21 ± 3% perf-stat.ps.cpu-migrations
1.048e+13 +2.8% 1.078e+13 +2.6% 1.075e+13 perf-stat.total.instructions
74.97 -1.9 73.07 -2.1 72.88 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
36.79 -1.6 35.22 -1.6 35.17 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
24.98 -1.3 23.64 -1.4 23.57 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
19.91 -1.1 18.85 -1.1 18.83 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
10.64 ± 3% -0.9 9.79 ± 3% -0.6 10.02 ± 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
10.63 ± 3% -0.9 9.78 ± 3% -0.6 10.01 ± 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
10.63 ± 3% -0.9 9.78 ± 3% -0.6 10.01 ± 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
10.63 ± 3% -0.9 9.78 ± 3% -0.6 10.01 ± 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread
10.59 ± 3% -0.8 9.74 ± 3% -0.6 9.97 ± 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn
14.77 -0.8 14.00 -0.9 13.91 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
1.48 -0.5 0.99 -0.5 0.99 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
5.95 ± 3% -0.5 5.47 ± 3% -0.4 5.59 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
7.88 -0.4 7.48 -0.4 7.44 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.62 ± 3% -0.4 4.25 ± 3% -0.3 4.35 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
6.72 -0.4 6.36 -0.3 6.39 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
6.15 -0.3 5.82 -0.4 5.80 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
6.11 -0.3 5.78 -0.3 5.82 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap
5.78 -0.3 5.49 -0.3 5.46 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
5.54 -0.3 5.25 -0.3 5.22 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap
5.56 -0.3 5.28 -0.3 5.24 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap
5.19 -0.3 4.92 -0.3 4.89 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap
5.20 -0.3 4.94 -0.3 4.91 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma
3.20 ± 4% -0.3 2.94 ± 3% -0.2 3.01 ± 2% perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
4.09 -0.2 3.85 -0.2 3.85 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
4.68 -0.2 4.45 -0.3 4.41 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma
2.63 ± 3% -0.2 2.42 ± 3% -0.2 2.48 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs
2.36 ± 2% -0.2 2.16 ± 4% -0.2 2.17 ± 2% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete
3.56 -0.2 3.36 -0.2 3.37 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap
4.00 -0.2 3.81 -0.2 3.78 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma
1.35 -0.2 1.16 -0.2 1.17 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
3.40 -0.2 3.22 -0.2 3.21 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap
2.22 -0.2 2.06 -0.2 2.05 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
0.96 -0.2 0.82 -0.1 0.82 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to
3.25 -0.1 3.10 -0.2 3.10 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
1.81 ± 4% -0.1 1.67 ± 3% -0.1 1.71 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
1.97 ± 3% -0.1 1.83 ± 3% -0.2 1.81 ± 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
2.26 -0.1 2.12 -0.2 2.11 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
3.10 -0.1 2.96 -0.1 2.99 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
3.13 -0.1 2.99 -0.1 3.00 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma
2.97 -0.1 2.85 -0.2 2.82 perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
2.05 -0.1 1.93 -0.1 1.92 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap
8.26 -0.1 8.14 -0.1 8.16 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
2.45 -0.1 2.34 -0.1 2.34 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma
2.43 -0.1 2.32 -0.1 2.32 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
1.75 ± 2% -0.1 1.64 ± 3% -0.1 1.61 ± 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap
0.54 -0.1 0.44 ± 37% -0.2 0.36 ± 63% perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
2.21 -0.1 2.11 -0.1 2.11 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables
1.27 ± 2% -0.1 1.16 ± 4% -0.1 1.18 ± 2% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge
1.32 ± 3% -0.1 1.22 ± 3% -0.1 1.25 perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
1.85 -0.1 1.76 -0.1 1.76 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
2.14 ± 2% -0.1 2.05 ± 2% -0.1 2.00 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap
1.40 -0.1 1.31 -0.1 1.30 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap
1.77 ± 3% -0.1 1.68 ± 2% -0.1 1.64 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap
1.39 -0.1 1.30 -0.1 1.30 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma
1.24 -0.1 1.16 -0.1 1.16 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap
1.40 ± 3% -0.1 1.32 ± 4% -0.1 1.26 ± 5% perf-profile.calltrace.cycles-pp.down_write.vma_prepare.vma_merge.copy_vma.move_vma
0.94 -0.1 0.86 -0.1 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap
1.23 -0.1 1.15 -0.1 1.15 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma
1.54 -0.1 1.46 -0.1 1.46 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
0.73 -0.1 0.67 -0.1 0.67 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
1.15 -0.1 1.09 -0.1 1.09 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
0.60 ± 2% -0.1 0.54 -0.1 0.54 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64
1.27 -0.1 1.21 -0.1 1.21 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
0.72 -0.1 0.66 -0.1 0.65 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap
0.70 ± 2% -0.1 0.64 ± 3% -0.1 0.64 ± 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma
0.79 -0.1 0.73 -0.1 0.73 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma
0.80 ± 2% -0.1 0.75 -0.1 0.74 ± 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge
0.78 -0.1 0.72 -0.1 0.72 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge
1.02 -0.1 0.96 -0.1 0.96 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
1.63 -0.1 1.58 -0.1 1.57 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.62 -0.0 0.58 -0.1 0.57 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma
0.60 ± 3% -0.0 0.56 ± 3% -0.0 0.57 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
0.86 -0.0 0.81 -0.0 0.81 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64
0.67 -0.0 0.62 -0.0 0.63 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
1.02 -0.0 0.97 -0.0 0.97 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.76 ± 2% -0.0 0.71 -0.0 0.72 ± 2% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
0.70 -0.0 0.66 -0.0 0.66 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
0.67 ± 2% -0.0 0.63 -0.0 0.63 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap
0.81 -0.0 0.77 -0.0 0.77 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
0.56 -0.0 0.51 -0.1 0.44 ± 40% perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma
0.98 -0.0 0.93 -0.0 0.93 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
0.69 -0.0 0.65 -0.0 0.65 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma
0.78 -0.0 0.74 -0.0 0.74 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap
1.12 -0.0 1.08 -0.0 1.07 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap
0.68 -0.0 0.65 -0.0 0.65 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap
1.00 -0.0 0.97 -0.0 0.97 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap
0.62 -0.0 0.59 -0.0 0.59 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
0.88 -0.0 0.85 -0.0 0.85 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
1.15 -0.0 1.12 -0.0 1.14 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
0.60 -0.0 0.57 ± 2% -0.0 0.57 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
0.59 -0.0 0.56 -0.0 0.55 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap
0.62 ± 2% -0.0 0.59 ± 2% -0.0 0.58 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
0.65 -0.0 0.63 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
0.67 +0.1 0.74 +0.1 0.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
0.76 +0.1 0.84 +0.1 0.83 perf-profile.calltrace.cycles-pp.__madvise
0.66 +0.1 0.74 +0.1 0.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
0.63 +0.1 0.71 +0.1 0.71 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
0.62 +0.1 0.70 +0.1 0.70 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
3.47 +0.1 3.55 +0.1 3.56 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma
87.67 +0.8 88.47 +0.6 88.26 perf-profile.calltrace.cycles-pp.mremap
0.00 +0.9 0.86 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap
0.00 +0.9 0.88 +0.9 0.88 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap
0.00 +0.9 0.90 ± 2% +0.9 0.89 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma
84.82 +1.0 85.80 +0.8 85.60 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap
84.66 +1.0 85.65 +0.8 85.45 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
83.71 +1.0 84.73 +0.8 84.55 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
0.00 +1.1 1.10 +1.1 1.10 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64
0.00 +1.2 1.21 +1.2 1.22 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to
2.09 +1.5 3.60 +1.5 3.60 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +1.5 1.51 +1.5 1.49 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap
1.59 +1.5 3.12 +1.5 3.13 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64
0.00 +1.6 1.62 +1.6 1.62 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +1.7 1.72 +1.7 1.73 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
0.00 +2.0 2.01 +2.0 1.99 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
5.34 +3.0 8.38 +3.0 8.37 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
75.13 -1.9 73.22 -2.1 73.04 perf-profile.children.cycles-pp.move_vma
37.01 -1.6 35.43 -1.6 35.38 perf-profile.children.cycles-pp.do_vmi_align_munmap
25.06 -1.3 23.71 -1.4 23.65 perf-profile.children.cycles-pp.copy_vma
20.00 -1.1 18.94 -1.1 18.91 perf-profile.children.cycles-pp.__split_vma
19.86 -1.0 18.87 -1.0 18.89 perf-profile.children.cycles-pp.rcu_core
19.84 -1.0 18.85 -1.0 18.87 perf-profile.children.cycles-pp.rcu_do_batch
19.88 -1.0 18.89 -1.0 18.91 perf-profile.children.cycles-pp.handle_softirqs
10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.children.cycles-pp.kthread
10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.children.cycles-pp.ret_from_fork
10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.children.cycles-pp.ret_from_fork_asm
10.64 ± 3% -0.9 9.79 ± 3% -0.6 10.02 ± 2% perf-profile.children.cycles-pp.smpboot_thread_fn
10.63 ± 3% -0.9 9.78 ± 3% -0.6 10.01 ± 2% perf-profile.children.cycles-pp.run_ksoftirqd
17.53 -0.8 16.70 -0.8 16.72 perf-profile.children.cycles-pp.kmem_cache_free
15.28 -0.8 14.47 -0.8 14.48 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
15.16 -0.8 14.37 -0.9 14.29 perf-profile.children.cycles-pp.vma_merge
12.18 -0.6 11.54 -0.7 11.49 perf-profile.children.cycles-pp.mas_wr_store_entry
11.98 -0.6 11.36 -0.7 11.30 perf-profile.children.cycles-pp.mas_store_prealloc
12.11 -0.6 11.51 -0.6 11.51 perf-profile.children.cycles-pp.__slab_free
10.86 -0.6 10.26 -0.6 10.30 perf-profile.children.cycles-pp.vm_area_dup
9.89 -0.5 9.40 -0.6 9.33 perf-profile.children.cycles-pp.mas_wr_node_store
8.36 -0.4 7.92 -0.4 7.91 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
7.98 -0.4 7.58 -0.4 7.55 perf-profile.children.cycles-pp.move_page_tables
6.69 -0.4 6.33 -0.4 6.32 perf-profile.children.cycles-pp.vma_complete
5.86 -0.3 5.56 -0.3 5.53 perf-profile.children.cycles-pp.move_ptes
5.11 -0.3 4.81 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate
6.05 -0.3 5.75 -0.3 5.76 perf-profile.children.cycles-pp.vm_area_free_rcu_cb
2.98 ± 2% -0.3 2.73 ± 4% -0.2 2.75 ± 2% perf-profile.children.cycles-pp.__memcpy
3.48 -0.2 3.26 -0.2 3.27 perf-profile.children.cycles-pp.___slab_alloc
3.46 ± 2% -0.2 3.26 -0.2 3.27 ± 2% perf-profile.children.cycles-pp.mod_objcg_state
2.91 -0.2 2.73 -0.2 2.73 perf-profile.children.cycles-pp.mas_alloc_nodes
2.43 -0.2 2.25 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev
3.47 -0.2 3.29 -0.2 3.23 ± 2% perf-profile.children.cycles-pp.down_write
3.46 -0.2 3.28 -0.2 3.27 perf-profile.children.cycles-pp.flush_tlb_mm_range
4.22 -0.2 4.06 -0.2 4.05 perf-profile.children.cycles-pp.anon_vma_clone
3.32 -0.2 3.17 -0.1 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook
3.35 -0.2 3.20 -0.2 3.20 perf-profile.children.cycles-pp.mas_store_gfp
2.22 -0.1 2.07 -0.1 2.07 perf-profile.children.cycles-pp.__cond_resched
3.18 -0.1 3.04 -0.1 3.05 perf-profile.children.cycles-pp.unmap_vmas
2.05 ± 2% -0.1 1.91 -0.1 1.93 ± 2% perf-profile.children.cycles-pp.allocate_slab
2.24 -0.1 2.11 ± 2% -0.2 2.08 ± 2% perf-profile.children.cycles-pp.vma_prepare
2.12 -0.1 2.00 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common
2.66 -0.1 2.53 -0.1 2.54 perf-profile.children.cycles-pp.mtree_load
2.46 -0.1 2.34 -0.1 2.34 perf-profile.children.cycles-pp.rcu_cblist_dequeue
2.49 -0.1 2.38 -0.1 2.38 perf-profile.children.cycles-pp.flush_tlb_func
8.32 -0.1 8.21 -0.1 8.22 perf-profile.children.cycles-pp.unmap_region
2.48 -0.1 2.37 -0.1 2.37 perf-profile.children.cycles-pp.unmap_page_range
2.23 -0.1 2.13 -0.1 2.13 perf-profile.children.cycles-pp.native_flush_tlb_one_user
1.77 -0.1 1.67 -0.1 1.67 perf-profile.children.cycles-pp.mas_wr_walk
1.88 -0.1 1.78 -0.1 1.78 perf-profile.children.cycles-pp.vma_link
1.40 -0.1 1.31 -0.1 1.31 perf-profile.children.cycles-pp.shuffle_freelist
1.84 -0.1 1.75 -0.1 1.76 ± 2% perf-profile.children.cycles-pp.up_write
0.97 ± 2% -0.1 0.88 -0.1 0.88 perf-profile.children.cycles-pp.rcu_all_qs
1.03 -0.1 0.95 -0.1 0.95 perf-profile.children.cycles-pp.mas_prev
0.92 -0.1 0.85 -0.1 0.84 perf-profile.children.cycles-pp.mas_prev_setup
1.58 -0.1 1.50 -0.1 1.50 perf-profile.children.cycles-pp.zap_pmd_range
1.24 -0.1 1.17 -0.1 1.16 perf-profile.children.cycles-pp.mas_prev_slot
1.58 -0.1 1.51 -0.1 1.51 perf-profile.children.cycles-pp.mas_update_gap
0.62 -0.1 0.56 -0.1 0.56 perf-profile.children.cycles-pp.security_mmap_addr
0.49 ± 2% -0.1 0.43 -0.1 0.44 ± 2% perf-profile.children.cycles-pp.setup_object
0.90 -0.1 0.84 -0.1 0.85 perf-profile.children.cycles-pp.percpu_counter_add_batch
0.98 -0.1 0.92 -0.1 0.93 perf-profile.children.cycles-pp.mas_pop_node
0.85 -0.1 0.80 -0.1 0.79 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
1.68 -0.1 1.62 -0.1 1.61 perf-profile.children.cycles-pp.__get_unmapped_area
1.23 -0.1 1.18 -0.1 1.17 perf-profile.children.cycles-pp.__pte_offset_map_lock
1.08 -0.1 1.03 -0.1 1.03 perf-profile.children.cycles-pp.zap_pte_range
0.69 ± 2% -0.0 0.64 -0.0 0.65 perf-profile.children.cycles-pp.syscall_return_via_sysret
1.04 -0.0 1.00 -0.1 0.99 perf-profile.children.cycles-pp.vma_to_resize
1.08 -0.0 1.04 -0.0 1.04 perf-profile.children.cycles-pp.mas_leaf_max_gap
0.51 ± 3% -0.0 0.47 -0.0 0.49 ± 4% perf-profile.children.cycles-pp.anon_vma_interval_tree_insert
1.18 -0.0 1.14 -0.1 1.13 perf-profile.children.cycles-pp.clear_bhb_loop
0.57 -0.0 0.53 -0.0 0.53 perf-profile.children.cycles-pp.mas_wr_end_piv
0.43 -0.0 0.40 -0.0 0.39 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
1.14 -0.0 1.10 -0.0 1.10 perf-profile.children.cycles-pp.mt_find
0.46 ± 7% -0.0 0.42 ± 2% -0.0 0.42 perf-profile.children.cycles-pp._raw_spin_lock
0.62 -0.0 0.58 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials
0.90 -0.0 0.87 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete
0.46 ± 3% -0.0 0.42 ± 3% -0.0 0.42 ± 3% perf-profile.children.cycles-pp.__alloc_pages_noprof
0.61 -0.0 0.58 -0.0 0.58 perf-profile.children.cycles-pp.entry_SYSCALL_64
0.48 -0.0 0.45 ± 2% -0.0 0.45 perf-profile.children.cycles-pp.mas_prev_range
0.64 -0.0 0.61 -0.0 0.60 perf-profile.children.cycles-pp.get_old_pud
0.31 ± 2% -0.0 0.28 ± 3% -0.0 0.28 ± 3% perf-profile.children.cycles-pp.security_vm_enough_memory_mm
0.33 ± 3% -0.0 0.30 ± 2% -0.0 0.30 ± 3% perf-profile.children.cycles-pp.mas_put_in_tree
0.32 ± 2% -0.0 0.29 ± 2% -0.0 0.30 ± 2% perf-profile.children.cycles-pp.tlb_finish_mmu
0.47 -0.0 0.44 ± 2% -0.0 0.44 perf-profile.children.cycles-pp.rcu_segcblist_enqueue
0.33 -0.0 0.31 -0.0 0.32 perf-profile.children.cycles-pp.mas_destroy
0.40 -0.0 0.39 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
0.35 -0.0 0.34 -0.0 0.33 perf-profile.children.cycles-pp.__rb_insert_augmented
0.25 ± 4% -0.0 0.23 ± 3% -0.0 0.23 ± 3% perf-profile.children.cycles-pp.rmqueue
0.39 -0.0 0.37 -0.0 0.37 perf-profile.children.cycles-pp.down_write_killable
0.18 ± 3% -0.0 0.17 ± 5% -0.0 0.16 ± 5% perf-profile.children.cycles-pp.cap_vm_enough_memory
0.22 ± 4% -0.0 0.20 ± 3% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.__rmqueue_pcplist
0.21 ± 4% -0.0 0.19 ± 3% -0.0 0.19 ± 3% perf-profile.children.cycles-pp.rmqueue_bulk
0.52 -0.0 0.51 ± 2% -0.0 0.50 perf-profile.children.cycles-pp.__pte_offset_map
0.26 -0.0 0.24 ± 2% -0.0 0.24 ± 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.30 ± 2% -0.0 0.28 ± 2% -0.0 0.28 ± 3% perf-profile.children.cycles-pp.__vm_enough_memory
0.29 -0.0 0.27 -0.0 0.27 ± 3% perf-profile.children.cycles-pp.tlb_gather_mmu
0.16 ± 2% -0.0 0.14 ± 3% -0.0 0.14 ± 4% perf-profile.children.cycles-pp.mas_wr_append
0.28 ± 2% -0.0 0.26 -0.0 0.26 perf-profile.children.cycles-pp.khugepaged_enter_vma
0.32 -0.0 0.30 -0.0 0.30 perf-profile.children.cycles-pp.mas_wr_store_setup
0.20 ± 2% -0.0 0.18 ± 2% -0.0 0.18 ± 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders
0.32 -0.0 0.30 -0.0 0.30 ± 2% perf-profile.children.cycles-pp.pte_offset_map_nolock
0.09 ± 4% -0.0 0.08 ± 5% -0.0 0.07 ± 6% perf-profile.children.cycles-pp.vma_dup_policy
0.36 -0.0 0.35 -0.0 0.35 perf-profile.children.cycles-pp.madvise_vma_behavior
0.16 ± 3% -0.0 0.16 ± 2% -0.0 0.15 ± 3% perf-profile.children.cycles-pp._find_next_bit
0.14 ± 3% +0.0 0.15 ± 2% +0.0 0.15 perf-profile.children.cycles-pp.free_pgd_range
0.08 ± 4% +0.0 0.10 ± 4% +0.0 0.10 ± 4% perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags
0.78 +0.1 0.85 +0.1 0.85 perf-profile.children.cycles-pp.__madvise
0.63 +0.1 0.71 +0.1 0.71 perf-profile.children.cycles-pp.__x64_sys_madvise
0.63 +0.1 0.70 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise
3.52 +0.1 3.60 +0.1 3.61 perf-profile.children.cycles-pp.free_pgtables
0.00 +0.1 0.09 +0.1 0.09 ± 4% perf-profile.children.cycles-pp.can_modify_mm_madv
1.30 +0.2 1.46 +0.2 1.46 perf-profile.children.cycles-pp.mas_next_slot
88.06 +0.8 88.84 +0.6 88.64 perf-profile.children.cycles-pp.mremap
83.81 +1.0 84.84 +0.8 84.65 perf-profile.children.cycles-pp.__do_sys_mremap
85.98 +1.0 87.02 +0.8 86.82 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
85.50 +1.1 86.56 +0.9 86.36 perf-profile.children.cycles-pp.do_syscall_64
2.12 +1.5 3.62 +1.5 3.63 perf-profile.children.cycles-pp.do_munmap
40.41 +1.5 41.93 +1.5 41.86 perf-profile.children.cycles-pp.do_vmi_munmap
3.62 +2.4 5.98 +2.3 5.95 perf-profile.children.cycles-pp.mas_walk
5.40 +3.0 8.44 +3.0 8.43 perf-profile.children.cycles-pp.mremap_to
5.26 +3.2 8.48 +3.2 8.47 perf-profile.children.cycles-pp.mas_find
0.00 +5.5 5.46 +5.4 5.45 perf-profile.children.cycles-pp.can_modify_mm
11.49 -0.6 10.92 -0.6 10.93 perf-profile.self.cycles-pp.__slab_free
4.32 -0.2 4.07 -0.3 3.98 ± 2% perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
1.96 -0.2 1.80 ± 4% -0.1 1.83 ± 2% perf-profile.self.cycles-pp.__memcpy
2.36 ± 2% -0.1 2.24 ± 2% -0.2 2.19 ± 2% perf-profile.self.cycles-pp.down_write
2.42 -0.1 2.30 -0.1 2.32 perf-profile.self.cycles-pp.rcu_cblist_dequeue
2.33 -0.1 2.22 -0.1 2.23 perf-profile.self.cycles-pp.mtree_load
2.21 -0.1 2.10 -0.1 2.10 perf-profile.self.cycles-pp.native_flush_tlb_one_user
1.62 -0.1 1.54 -0.1 1.55 ± 2% perf-profile.self.cycles-pp.__memcg_slab_free_hook
1.52 -0.1 1.44 -0.1 1.44 perf-profile.self.cycles-pp.mas_wr_walk
1.15 ± 2% -0.1 1.07 -0.1 1.08 perf-profile.self.cycles-pp.shuffle_freelist
1.53 -0.1 1.45 -0.1 1.47 ± 2% perf-profile.self.cycles-pp.up_write
1.44 -0.1 1.36 -0.1 1.36 perf-profile.self.cycles-pp.__call_rcu_common
0.70 ± 2% -0.1 0.62 -0.1 0.63 perf-profile.self.cycles-pp.rcu_all_qs
1.72 -0.1 1.66 -0.1 1.66 perf-profile.self.cycles-pp.mod_objcg_state
3.77 -0.1 3.70 ± 4% -0.2 3.62 ± 2% perf-profile.self.cycles-pp.mas_wr_node_store
0.51 ± 3% -0.1 0.45 -0.1 0.45 perf-profile.self.cycles-pp.security_mmap_addr
0.94 ± 2% -0.1 0.88 ± 4% -0.1 0.88 ± 2% perf-profile.self.cycles-pp.vm_area_dup
1.18 -0.1 1.12 -0.1 1.12 perf-profile.self.cycles-pp.vma_merge
1.38 -0.1 1.33 -0.1 1.32 perf-profile.self.cycles-pp.do_vmi_align_munmap
0.89 -0.1 0.83 -0.1 0.83 perf-profile.self.cycles-pp.___slab_alloc
0.62 -0.1 0.56 ± 2% -0.1 0.56 ± 2% perf-profile.self.cycles-pp.mremap
1.00 -0.1 0.95 -0.1 0.95 perf-profile.self.cycles-pp.mas_preallocate
0.98 -0.1 0.93 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes
0.99 -0.1 0.94 -0.1 0.93 perf-profile.self.cycles-pp.mas_prev_slot
1.09 -0.0 1.04 ± 2% -0.0 1.05 perf-profile.self.cycles-pp.__cond_resched
0.94 -0.0 0.90 -0.1 0.89 perf-profile.self.cycles-pp.vm_area_free_rcu_cb
0.85 -0.0 0.80 -0.0 0.81 perf-profile.self.cycles-pp.mas_pop_node
0.77 -0.0 0.72 -0.0 0.73 perf-profile.self.cycles-pp.percpu_counter_add_batch
0.68 -0.0 0.63 -0.0 0.64 perf-profile.self.cycles-pp.__split_vma
1.17 -0.0 1.13 -0.1 1.12 perf-profile.self.cycles-pp.clear_bhb_loop
0.95 -0.0 0.91 -0.0 0.91 perf-profile.self.cycles-pp.mas_leaf_max_gap
0.79 -0.0 0.75 -0.0 0.76 perf-profile.self.cycles-pp.mas_wr_store_entry
0.44 -0.0 0.40 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap
1.22 -0.0 1.18 -0.0 1.19 perf-profile.self.cycles-pp.move_vma
0.89 -0.0 0.86 -0.0 0.86 perf-profile.self.cycles-pp.mas_store_gfp
0.45 -0.0 0.42 -0.0 0.42 perf-profile.self.cycles-pp.mas_wr_end_piv
0.43 ± 2% -0.0 0.40 -0.0 0.39 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.78 -0.0 0.75 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete
0.66 -0.0 0.63 -0.0 0.63 perf-profile.self.cycles-pp.mas_store_prealloc
1.49 -0.0 1.46 -0.0 1.45 perf-profile.self.cycles-pp.kmem_cache_free
0.60 -0.0 0.58 -0.0 0.58 perf-profile.self.cycles-pp.unmap_region
0.86 -0.0 0.83 -0.0 0.83 perf-profile.self.cycles-pp.move_page_tables
0.43 ± 4% -0.0 0.40 -0.0 0.42 ± 4% perf-profile.self.cycles-pp.anon_vma_interval_tree_insert
0.99 -0.0 0.97 -0.0 0.97 perf-profile.self.cycles-pp.mt_find
0.36 ± 3% -0.0 0.33 ± 2% -0.0 0.33 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.71 -0.0 0.68 -0.0 0.68 perf-profile.self.cycles-pp.unmap_page_range
0.55 -0.0 0.52 -0.0 0.51 perf-profile.self.cycles-pp.get_old_pud
0.49 -0.0 0.47 -0.0 0.47 perf-profile.self.cycles-pp.find_vma_prev
0.27 -0.0 0.25 -0.0 0.25 ± 2% perf-profile.self.cycles-pp.mas_prev_setup
0.41 -0.0 0.39 -0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.61 -0.0 0.58 -0.0 0.58 perf-profile.self.cycles-pp.copy_vma
0.47 -0.0 0.45 ± 2% -0.0 0.45 perf-profile.self.cycles-pp.flush_tlb_mm_range
0.37 ± 6% -0.0 0.35 ± 2% -0.0 0.35 perf-profile.self.cycles-pp._raw_spin_lock
0.42 ± 2% -0.0 0.40 ± 2% -0.0 0.40 perf-profile.self.cycles-pp.rcu_segcblist_enqueue
0.27 -0.0 0.25 ± 2% -0.0 0.25 ± 2% perf-profile.self.cycles-pp.mas_put_in_tree
0.39 -0.0 0.37 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.44 -0.0 0.42 -0.0 0.42 perf-profile.self.cycles-pp.mas_update_gap
0.49 -0.0 0.47 -0.0 0.48 ± 2% perf-profile.self.cycles-pp.refill_obj_stock
0.27 ± 2% -0.0 0.25 ± 2% -0.0 0.25 perf-profile.self.cycles-pp.tlb_finish_mmu
0.34 -0.0 0.32 -0.0 0.32 ± 2% perf-profile.self.cycles-pp.zap_pmd_range
0.48 -0.0 0.46 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.28 -0.0 0.26 -0.0 0.26 perf-profile.self.cycles-pp.mas_alloc_nodes
0.24 ± 2% -0.0 0.22 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev
0.14 ± 3% -0.0 0.12 ± 2% -0.0 0.12 ± 2% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.26 -0.0 0.24 -0.0 0.24 perf-profile.self.cycles-pp.__rb_insert_augmented
0.40 -0.0 0.39 -0.0 0.39 perf-profile.self.cycles-pp.__pte_offset_map_lock
0.28 -0.0 0.26 ± 3% -0.0 0.26 perf-profile.self.cycles-pp.mas_prev_range
0.33 ± 2% -0.0 0.32 -0.0 0.31 perf-profile.self.cycles-pp.zap_pte_range
0.28 -0.0 0.26 -0.0 0.26 perf-profile.self.cycles-pp.flush_tlb_func
0.44 -0.0 0.42 ± 2% -0.0 0.42 perf-profile.self.cycles-pp.__pte_offset_map
0.22 -0.0 0.21 ± 2% -0.0 0.21 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64
0.17 -0.0 0.16 -0.0 0.16 perf-profile.self.cycles-pp.__thp_vma_allowable_orders
0.10 -0.0 0.09 -0.0 0.09 ± 3% perf-profile.self.cycles-pp.mod_node_page_state
0.06 -0.0 0.05 -0.0 0.05 perf-profile.self.cycles-pp.vma_dup_policy
0.06 ± 5% +0.0 0.07 +0.0 0.07 ± 4% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags
0.11 ± 4% +0.0 0.12 ± 4% +0.0 0.12 ± 2% perf-profile.self.cycles-pp.free_pgd_range
0.21 +0.0 0.22 ± 2% +0.0 0.22 perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags
0.45 +0.0 0.48 +0.0 0.48 perf-profile.self.cycles-pp.do_vmi_munmap
0.27 +0.0 0.32 +0.0 0.31 perf-profile.self.cycles-pp.free_pgtables
0.36 ± 2% +0.1 0.44 +0.1 0.45 perf-profile.self.cycles-pp.unlink_anon_vmas
1.06 +0.1 1.19 +0.1 1.19 perf-profile.self.cycles-pp.mas_next_slot
1.49 +0.5 2.01 +0.5 2.02 perf-profile.self.cycles-pp.mas_find
0.00 +1.4 1.38 +1.4 1.38 perf-profile.self.cycles-pp.can_modify_mm
3.15 +2.1 5.23 +2.1 5.22 perf-profile.self.cycles-pp.mas_walk
>
> Linus
> arch/powerpc/include/asm/mmu_context.h | 9 ---------
> arch/powerpc/kernel/vdso.c | 17 +++++++++++++++--
> arch/x86/include/asm/mmu_context.h | 5 -----
> include/asm-generic/mm_hooks.h | 11 +++--------
> include/linux/mm_types.h | 2 ++
> mm/mmap.c | 15 ++++++---------
> 6 files changed, 26 insertions(+), 33 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> index 37bffa0f7918..a334a1368848 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -260,15 +260,6 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
>
> extern void arch_exit_mmap(struct mm_struct *mm);
>
> -static inline void arch_unmap(struct mm_struct *mm,
> - unsigned long start, unsigned long end)
> -{
> - unsigned long vdso_base = (unsigned long)mm->context.vdso;
> -
> - if (start <= vdso_base && vdso_base < end)
> - mm->context.vdso = NULL;
> -}
> -
> #ifdef CONFIG_PPC_MEM_KEYS
> bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
> bool execute, bool foreign);
> diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
> index 7a2ff9010f17..6fa041a6690a 100644
> --- a/arch/powerpc/kernel/vdso.c
> +++ b/arch/powerpc/kernel/vdso.c
> @@ -81,6 +81,13 @@ static int vdso64_mremap(const struct vm_special_mapping *sm, struct vm_area_str
> return vdso_mremap(sm, new_vma, &vdso64_end - &vdso64_start);
> }
>
> +static int vvar_close(const struct vm_special_mapping *sm,
> + struct vm_area_struct *vma)
> +{
> + struct mm_struct *mm = vma->vm_mm;
> + mm->context.vdso = NULL;
> +}
> +
> static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
> struct vm_area_struct *vma, struct vm_fault *vmf);
>
> @@ -92,11 +99,13 @@ static struct vm_special_mapping vvar_spec __ro_after_init = {
> static struct vm_special_mapping vdso32_spec __ro_after_init = {
> .name = "[vdso]",
> .mremap = vdso32_mremap,
> + .close = vvar_close,
> };
>
> static struct vm_special_mapping vdso64_spec __ro_after_init = {
> .name = "[vdso]",
> .mremap = vdso64_mremap,
> + .close = vvar_close,
> };
>
> #ifdef CONFIG_TIME_NS
> @@ -207,8 +216,10 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int
> vma = _install_special_mapping(mm, vdso_base, vvar_size,
> VM_READ | VM_MAYREAD | VM_IO |
> VM_DONTDUMP | VM_PFNMAP, &vvar_spec);
> - if (IS_ERR(vma))
> + if (IS_ERR(vma)) {
> + mm->context.vdso = NULL;
> return PTR_ERR(vma);
> + }
>
> /*
> * our vma flags don't have VM_WRITE so by default, the process isn't
> @@ -223,8 +234,10 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int
> vma = _install_special_mapping(mm, vdso_base + vvar_size, vdso_size,
> VM_READ | VM_EXEC | VM_MAYREAD |
> VM_MAYWRITE | VM_MAYEXEC, vdso_spec);
> - if (IS_ERR(vma))
> + if (IS_ERR(vma)) {
> + mm->context.vdso = NULL;
> do_munmap(mm, vdso_base, vvar_size, NULL);
> + }
>
> return PTR_ERR_OR_ZERO(vma);
> }
> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
> index 8dac45a2c7fc..80f2a3187aa6 100644
> --- a/arch/x86/include/asm/mmu_context.h
> +++ b/arch/x86/include/asm/mmu_context.h
> @@ -232,11 +232,6 @@ static inline bool is_64bit_mm(struct mm_struct *mm)
> }
> #endif
>
> -static inline void arch_unmap(struct mm_struct *mm, unsigned long start,
> - unsigned long end)
> -{
> -}
> -
> /*
> * We only want to enforce protection keys on the current process
> * because we effectively have no access to PKRU for other
> diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h
> index 4dbb177d1150..6eea3b3c1e65 100644
> --- a/include/asm-generic/mm_hooks.h
> +++ b/include/asm-generic/mm_hooks.h
> @@ -1,8 +1,8 @@
> /* SPDX-License-Identifier: GPL-2.0 */
> /*
> - * Define generic no-op hooks for arch_dup_mmap, arch_exit_mmap
> - * and arch_unmap to be included in asm-FOO/mmu_context.h for any
> - * arch FOO which doesn't need to hook these.
> + * Define generic no-op hooks for arch_dup_mmap and arch_exit_mmap
> + * to be included in asm-FOO/mmu_context.h for any arch FOO which
> + * doesn't need to hook these.
> */
> #ifndef _ASM_GENERIC_MM_HOOKS_H
> #define _ASM_GENERIC_MM_HOOKS_H
> @@ -17,11 +17,6 @@ static inline void arch_exit_mmap(struct mm_struct *mm)
> {
> }
>
> -static inline void arch_unmap(struct mm_struct *mm,
> - unsigned long start, unsigned long end)
> -{
> -}
> -
> static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
> bool write, bool execute, bool foreign)
> {
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 485424979254..ef32d87a3adc 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -1313,6 +1313,8 @@ struct vm_special_mapping {
>
> int (*mremap)(const struct vm_special_mapping *sm,
> struct vm_area_struct *new_vma);
> + void (*close)(const struct vm_special_mapping *sm,
> + struct vm_area_struct *vma);
> };
>
> enum tlb_flush_reason {
> diff --git a/mm/mmap.c b/mm/mmap.c
> index d0dfc85b209b..adaaf1ef197a 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2789,7 +2789,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> *
> * This function takes a @mas that is either pointing to the previous VMA or set
> * to MA_START and sets it up to remove the mapping(s). The @len will be
> - * aligned and any arch_unmap work will be preformed.
> + * aligned.
> *
> * Return: 0 on success and drops the lock if so directed, error and leaves the
> * lock held otherwise.
> @@ -2809,16 +2809,12 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
> return -EINVAL;
>
> /*
> - * Check if memory is sealed before arch_unmap.
> - * Prevent unmapping a sealed VMA.
> + * Check if memory is sealed, prevent unmapping a sealed VMA.
> * can_modify_mm assumes we have acquired the lock on MM.
> */
> if (unlikely(!can_modify_mm(mm, start, end)))
> return -EPERM;
>
> - /* arch_unmap() might do unmaps itself. */
> - arch_unmap(mm, start, end);
> -
> /* Find the first overlapping VMA */
> vma = vma_find(vmi, end);
> if (!vma) {
> @@ -3232,14 +3228,12 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> struct mm_struct *mm = vma->vm_mm;
>
> /*
> - * Check if memory is sealed before arch_unmap.
> - * Prevent unmapping a sealed VMA.
> + * Check if memory is sealed, prevent unmapping a sealed VMA.
> * can_modify_mm assumes we have acquired the lock on MM.
> */
> if (unlikely(!can_modify_mm(mm, start, end)))
> return -EPERM;
>
> - arch_unmap(mm, start, end);
> return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> }
>
> @@ -3624,6 +3618,9 @@ static vm_fault_t special_mapping_fault(struct vm_fault *vmf);
> */
> static void special_mapping_close(struct vm_area_struct *vma)
> {
> + const struct vm_special_mapping *sm = vma->vm_private_data;
> + if (sm->close)
> + sm->close(sm, vma);
> }
>
> static const char *special_mapping_name(struct vm_area_struct *vma)
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 2:17 ` Linus Torvalds
@ 2024-08-06 12:03 ` Michael Ellerman
2024-08-06 14:43 ` Linus Torvalds
0 siblings, 1 reply; 29+ messages in thread
From: Michael Ellerman @ 2024-08-06 12:03 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jeff Xu, Nicholas Piggin, Christophe Leroy, Pedro Falcato,
kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel,
Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet,
Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Mon, 5 Aug 2024 at 19:14, Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Needs a slight tweak to compile, vvar_close() needs to return void.
>
> Ack, shows just how untested it was.
>
>> And should probably be renamed vdso_close().
>
> .. and that was due to the initial confusion that I then fixed, but
> didn't fix the naming.
Ack.
> So yes, those fixes look ObviouslyCorrect(tm).
Needs another slight tweak to work correctly. Diff below.
With that our sigreturn_vdso selftest passes, and the CRIU vdso tests
pass also. So LGTM.
I'm not sure of the urgency on this, do you want to apply it directly?
If so feel free to add my tested-by/sob etc.
Or should I turn it into a series and post it?
cheers
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 431b46976db8..ed5ac4af4d83 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -85,6 +85,15 @@ static void vdso_close(const struct vm_special_mapping *sm,
struct vm_area_struct *vma)
{
struct mm_struct *mm = vma->vm_mm;
+
+ /*
+ * close() is called for munmap() but also for mremap(). In the mremap()
+ * case the vdso pointer has already been updated by the mremap() hook
+ * above, so it must not be set to NULL here.
+ */
+ if (vma->vm_start != (unsigned long)mm->context.vdso)
+ return;
+
mm->context.vdso = NULL;
}
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 6:04 ` Oliver Sang
@ 2024-08-06 14:38 ` Linus Torvalds
2024-08-06 21:37 ` Pedro Falcato
1 sibling, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2024-08-06 14:38 UTC (permalink / raw)
To: Oliver Sang
Cc: Jeff Xu, Michael Ellerman, Nicholas Piggin, Christophe Leroy,
Pedro Falcato, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton,
Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman,
Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes,
Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger,
Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco,
Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang,
fengwei.yin
On Mon, 5 Aug 2024 at 23:05, Oliver Sang <oliver.sang@intel.com> wrote:
>
> > New version - still untested, but now I've read through it one more
> > time - attached.
>
> we tested this version by applying it directly on top of 8be7258aad, but it
> seems to have little impact on performance; still a similar regression
> compared to ff388fe5c4.
Note that that patch (and Michael's fixes for ppc on top) in itself
doesn't fix any performance issue.
But getting rid of arch_unmap() means that now the can_modify_mm() in
do_vmi_munmap() is right above the "vma_find()" (and can in fact be
moved below it and into do_vmi_align_munmap), and that means that at
least the unmap paths don't need the vma lookup of can_modify_mm() at
all, because they've done their own.
IOW, the "arch_unmap()" removal was purely preparatory and did nothing
on its own, it's only preparatory to get rid of some of the
can_modify_mm() costs.
The call to can_modify_mm() in mremap_to() is a bit harder to get rid
of. Unless we just say "mremap will unmap the destination even if the
mremap source is sealed".
Linus
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 12:03 ` Michael Ellerman
@ 2024-08-06 14:43 ` Linus Torvalds
2024-08-07 12:26 ` Michael Ellerman
0 siblings, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2024-08-06 14:43 UTC (permalink / raw)
To: Michael Ellerman
Cc: Jeff Xu, Nicholas Piggin, Christophe Leroy, Pedro Falcato,
kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel,
Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet,
Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin
On Tue, 6 Aug 2024 at 05:03, Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Or should I turn it into a series and post it?
I think post it as a single working patch rather than as a series that
breaks things and then fixes it.
And considering that you did all the testing and found the problems,
just take ownership of it and make it a "Suggested-by: Linus" or
something.
That's what my original patch was anyway: "something like this".
Linus
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 1:44 ` Oliver Sang
@ 2024-08-06 14:54 ` Jeff Xu
0 siblings, 0 replies; 29+ messages in thread
From: Jeff Xu @ 2024-08-06 14:54 UTC (permalink / raw)
To: Oliver Sang
Cc: Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook,
Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman,
Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes,
Linus Torvalds, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin
On Mon, Aug 5, 2024 at 6:44 PM Oliver Sang <oliver.sang@intel.com> wrote:
>
> hi, Jeff,
>
> On Mon, Aug 05, 2024 at 09:58:33AM -0700, Jeff Xu wrote:
> > On Sun, Aug 4, 2024 at 1:59 AM kernel test robot <oliver.sang@intel.com> wrote:
> > >
> > >
> > >
> > > Hello,
> > >
> > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on:
> > >
> > >
> > > commit: 8be7258aad44b5e25977a98db136f677fa6f4370 ("mseal: add mseal syscall")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > >
> > > testcase: stress-ng
> > > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> > > parameters:
> > >
> > > nr_threads: 100%
> > > testtime: 60s
> > > test: pagemove
> > > cpufreq_governor: performance
> > >
> > >
> > > In addition to that, the commit also has significant impact on the following tests:
> > >
> > > +------------------+---------------------------------------------------------------------------------------------+
> > > | testcase: change | stress-ng: stress-ng.pkey.ops_per_sec -3.6% regression |
> > > | test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory |
> > > | test parameters | cpufreq_governor=performance |
> > > | | nr_threads=100% |
> > > | | test=pkey |
> > > | | testtime=60s |
> > > +------------------+---------------------------------------------------------------------------------------------+
> > >
> > >
> > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > the same patch/commit), kindly add following tags
> > > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > > | Closes: https://lore.kernel.org/oe-lkp/202408041602.caa0372-oliver.sang@intel.com
> > >
> > >
> > > Details are as below:
> > > -------------------------------------------------------------------------------------------------->
> > >
> > >
> > > The kernel config and materials to reproduce are available at:
> > > https://download.01.org/0day-ci/archive/20240804/202408041602.caa0372-oliver.sang@intel.com
> > >
> > There is an error when I try to reproduce the test:
>
> what's your os? we support some distributions
> https://github.com/intel/lkp-tests?tab=readme-ov-file#supported-distributions
>
> >
> > bin/lkp install job.yaml
> >
> > --------------------------------------------------------
> > Some packages could not be installed. This may mean that you have
> > requested an impossible situation or if you are using the unstable
> > distribution that some required packages have not yet been created
> > or been moved out of Incoming.
> > The following information may help to resolve the situation:
> >
> > The following packages have unmet dependencies:
> > libdw1 : Depends: libelf1 (= 0.190-1+b1)
> > libdw1t64 : Breaks: libdw1 (< 0.191-2)
> > E: Unable to correct problems, you have held broken packages.
> > Cannot install some packages of perf-c2c depends
> > -----------------------------------------------------------------------------------------
> >
> > And where is stress-ng.pagemove.page_remaps_per_sec test implemented,
> > is that part of lkp-tests ?
>
> stress-ng is in https://github.com/ColinIanKing/stress-ng
>
I will try this route first.
Thanks
-Jeff
> >
> > Thanks
> > -Jeff
> >
> > > =========================================================================================
> > > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> > > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
> > >
> > > commit:
> > > ff388fe5c4 ("mseal: wire up mseal syscall")
> > > 8be7258aad ("mseal: add mseal syscall")
> > >
> > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13
> > > ---------------- ---------------------------
> > > %stddev %change %stddev
> > > \ | \
> > > 41625945 -4.3% 39842322 proc-vmstat.numa_hit
> > > 41559175 -4.3% 39774160 proc-vmstat.numa_local
> > > 77484314 -4.4% 74105555 proc-vmstat.pgalloc_normal
> > > 77205752 -4.4% 73826672 proc-vmstat.pgfree
> > > 18361466 -4.2% 17596652 stress-ng.pagemove.ops
> > > 306014 -4.2% 293262 stress-ng.pagemove.ops_per_sec
> > > 205312 -4.4% 196176 stress-ng.pagemove.page_remaps_per_sec
> > > 4961 +1.0% 5013 stress-ng.time.percent_of_cpu_this_job_got
> > > 2917 +1.2% 2952 stress-ng.time.system_time
> > > 1.07 -6.6% 1.00 perf-stat.i.MPKI
> > > 3.354e+10 +3.5% 3.473e+10 perf-stat.i.branch-instructions
> > > 1.795e+08 -4.2% 1.719e+08 perf-stat.i.cache-misses
> > > 2.376e+08 -4.1% 2.279e+08 perf-stat.i.cache-references
> > > 1.13 -3.0% 1.10 perf-stat.i.cpi
> > > 1077 +4.3% 1124 perf-stat.i.cycles-between-cache-misses
> > > 1.717e+11 +2.7% 1.762e+11 perf-stat.i.instructions
> > > 0.88 +3.1% 0.91 perf-stat.i.ipc
> > > 1.05 -6.8% 0.97 perf-stat.overall.MPKI
> > > 0.25 ± 2% -0.0 0.24 perf-stat.overall.branch-miss-rate%
> > > 1.13 -3.0% 1.10 perf-stat.overall.cpi
> > > 1084 +4.0% 1127 perf-stat.overall.cycles-between-cache-misses
> > > 0.88 +3.1% 0.91 perf-stat.overall.ipc
> > > 3.298e+10 +3.5% 3.415e+10 perf-stat.ps.branch-instructions
> > > 1.764e+08 -4.3% 1.689e+08 perf-stat.ps.cache-misses
> > > 2.336e+08 -4.1% 2.24e+08 perf-stat.ps.cache-references
> > > 194.57 -2.4% 189.96 ± 2% perf-stat.ps.cpu-migrations
> > > 1.688e+11 +2.7% 1.733e+11 perf-stat.ps.instructions
> > > 1.036e+13 +3.0% 1.068e+13 perf-stat.total.instructions
> > > 75.12 -1.9 73.22 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > > 36.84 -1.6 35.29 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> > > 24.90 -1.2 23.72 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > > 19.89 -0.9 18.98 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > > 10.56 ± 2% -0.8 9.78 ± 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread
> > > 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
> > > 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> > > 10.57 ± 2% -0.8 9.80 ± 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> > > 10.52 ± 2% -0.8 9.75 ± 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn
> > > 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
> > > 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
> > > 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
> > > 14.75 -0.7 14.07 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> > > 1.50 -0.6 0.94 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> > > 5.88 ± 2% -0.4 5.47 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> > > 7.80 -0.3 7.47 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > > 4.55 ± 2% -0.3 4.24 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
> > > 6.76 -0.3 6.45 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > > 6.15 -0.3 5.86 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> > > 8.22 -0.3 7.93 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > > 6.12 -0.3 5.87 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > > 5.74 -0.2 5.50 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
> > > 3.16 ± 2% -0.2 2.94 perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> > > 5.50 -0.2 5.28 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> > > 1.36 -0.2 1.14 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
> > > 5.15 -0.2 4.94 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap
> > > 5.51 -0.2 5.31 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > > 5.16 -0.2 4.97 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma
> > > 2.24 -0.2 2.05 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> > > 2.60 ± 2% -0.2 2.42 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs
> > > 4.67 -0.2 4.49 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma
> > > 3.41 -0.2 3.23 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > > 3.00 -0.2 2.83 ą 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > > 0.96 -0.2 0.80 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to
> > > 4.04 -0.2 3.88 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> > > 3.20 ± 2% -0.2 3.04 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
> > > 3.53 -0.1 3.38 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap
> > > 3.40 -0.1 3.26 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap
> > > 2.20 ± 2% -0.1 2.06 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> > > 1.84 ± 3% -0.1 1.71 ± 3% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap
> > > 1.78 ± 2% -0.1 1.65 ± 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > > 2.69 -0.1 2.56 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap
> > > 1.78 ± 2% -0.1 1.66 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
> > > 1.36 ± 2% -0.1 1.23 ± 2% perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
> > > 0.95 -0.1 0.83 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap
> > > 3.29 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > > 2.08 -0.1 1.96 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > > 1.43 ± 3% -0.1 1.32 ± 3% perf-profile.calltrace.cycles-pp.down_write.vma_prepare.vma_merge.copy_vma.move_vma
> > > 2.21 -0.1 2.10 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > > 2.47 -0.1 2.36 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma
> > > 2.21 -0.1 2.12 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables
> > > 1.41 -0.1 1.32 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap
> > > 1.26 -0.1 1.18 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap
> > > 1.82 -0.1 1.75 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> > > 0.71 -0.1 0.63 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > > 1.29 -0.1 1.22 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
> > > 0.61 -0.1 0.54 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma
> > > 1.36 -0.1 1.29 perf-profile.calltrace.cycles-pp.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap
> > > 1.40 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma
> > > 0.70 -0.1 0.64 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
> > > 1.23 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma
> > > 1.66 -0.1 1.60 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > > 1.16 -0.1 1.10 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
> > > 0.96 -0.1 0.90 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region
> > > 1.14 -0.1 1.08 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
> > > 0.79 -0.1 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma
> > > 1.04 -0.1 1.00 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > > 0.58 -0.0 0.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap
> > > 0.61 -0.0 0.56 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
> > > 0.56 -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
> > > 0.57 -0.0 0.53 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
> > > 0.78 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge
> > > 0.88 -0.0 0.84 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64
> > > 0.70 -0.0 0.66 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > > 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > > 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap
> > > 0.97 -0.0 0.93 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > > 1.11 -0.0 1.08 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap
> > > 0.75 -0.0 0.72 perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
> > > 0.74 -0.0 0.71 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap
> > > 0.60 ± 2% -0.0 0.57 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64
> > > 0.67 ± 2% -0.0 0.64 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma
> > > 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > > 0.63 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
> > > 0.99 -0.0 0.96 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > > 0.62 ± 2% -0.0 0.59 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
> > > 0.87 -0.0 0.84 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > > 0.78 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap
> > > 0.64 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap
> > > 0.90 -0.0 0.87 perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap
> > > 0.54 -0.0 0.52 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.move_ptes.move_page_tables.move_vma.__do_sys_mremap
> > > 1.04 +0.0 1.08 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
> > > 0.76 +0.1 0.83 perf-profile.calltrace.cycles-pp.__madvise
> > > 0.63 +0.1 0.70 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> > > 0.62 +0.1 0.70 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> > > 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
> > > 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
> > > 87.74 +0.7 88.45 perf-profile.calltrace.cycles-pp.mremap
> > > 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap
> > > 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap
> > > 84.88 +0.9 85.77 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap
> > > 84.73 +0.9 85.62 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > > 0.00 +0.9 0.92 ą 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma
> > > 83.84 +0.9 84.78 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > > 0.00 +1.1 1.06 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64
> > > 0.00 +1.2 1.21 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to
> > > 2.07 +1.5 3.55 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > > 1.58 +1.5 3.07 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64
> > > 0.00 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap
> > > 0.00 +1.6 1.57 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > > 0.00 +1.7 1.72 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
> > > 0.00 +2.0 2.01 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
> > > 5.39 +2.9 8.32 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
> > > 75.29 -1.9 73.37 perf-profile.children.cycles-pp.move_vma
> > > 37.06 -1.6 35.50 perf-profile.children.cycles-pp.do_vmi_align_munmap
> > > 24.98 -1.2 23.80 perf-profile.children.cycles-pp.copy_vma
> > > 19.99 -1.0 19.02 perf-profile.children.cycles-pp.handle_softirqs
> > > 19.97 -1.0 19.00 perf-profile.children.cycles-pp.rcu_core
> > > 19.95 -1.0 18.98 perf-profile.children.cycles-pp.rcu_do_batch
> > > 19.98 -0.9 19.06 perf-profile.children.cycles-pp.__split_vma
> > > 17.55 -0.8 16.76 perf-profile.children.cycles-pp.kmem_cache_free
> > > 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.children.cycles-pp.run_ksoftirqd
> > > 10.57 ± 2% -0.8 9.80 ± 2% perf-profile.children.cycles-pp.smpboot_thread_fn
> > > 15.38 -0.8 14.62 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
> > > 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.children.cycles-pp.kthread
> > > 10.62 ± 2% -0.8 9.86 ± 2% perf-profile.children.cycles-pp.ret_from_fork
> > > 10.62 ± 2% -0.8 9.86 ± 2% perf-profile.children.cycles-pp.ret_from_fork_asm
> > > 15.14 -0.7 14.44 perf-profile.children.cycles-pp.vma_merge
> > > 12.08 -0.5 11.55 perf-profile.children.cycles-pp.__slab_free
> > > 12.11 -0.5 11.62 perf-profile.children.cycles-pp.mas_wr_store_entry
> > > 10.86 -0.5 10.39 perf-profile.children.cycles-pp.vm_area_dup
> > > 11.89 -0.5 11.44 perf-profile.children.cycles-pp.mas_store_prealloc
> > > 8.49 -0.4 8.06 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
> > > 9.88 -0.4 9.49 perf-profile.children.cycles-pp.mas_wr_node_store
> > > 7.91 -0.3 7.58 perf-profile.children.cycles-pp.move_page_tables
> > > 6.06 -0.3 5.78 perf-profile.children.cycles-pp.vm_area_free_rcu_cb
> > > 8.28 -0.3 8.00 perf-profile.children.cycles-pp.unmap_region
> > > 6.69 -0.3 6.42 perf-profile.children.cycles-pp.vma_complete
> > > 5.06 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate
> > > 5.82 -0.2 5.57 perf-profile.children.cycles-pp.move_ptes
> > > 4.24 -0.2 4.01 perf-profile.children.cycles-pp.anon_vma_clone
> > > 3.50 -0.2 3.30 perf-profile.children.cycles-pp.down_write
> > > 2.44 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev
> > > 3.46 -0.2 3.28 perf-profile.children.cycles-pp.___slab_alloc
> > > 3.45 -0.2 3.27 perf-profile.children.cycles-pp.free_pgtables
> > > 2.54 -0.2 2.37 perf-profile.children.cycles-pp.rcu_cblist_dequeue
> > > 3.35 -0.2 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook
> > > 2.93 -0.2 2.78 perf-profile.children.cycles-pp.mas_alloc_nodes
> > > 2.28 ± 2% -0.2 2.12 ± 2% perf-profile.children.cycles-pp.vma_prepare
> > > 3.46 -0.1 3.32 perf-profile.children.cycles-pp.flush_tlb_mm_range
> > > 3.41 -0.1 3.27 ± 2% perf-profile.children.cycles-pp.mod_objcg_state
> > > 2.76 -0.1 2.63 perf-profile.children.cycles-pp.unlink_anon_vmas
> > > 3.41 -0.1 3.28 perf-profile.children.cycles-pp.mas_store_gfp
> > > 2.21 -0.1 2.09 perf-profile.children.cycles-pp.__cond_resched
> > > 2.04 -0.1 1.94 perf-profile.children.cycles-pp.allocate_slab
> > > 2.10 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common
> > > 2.51 -0.1 2.40 perf-profile.children.cycles-pp.flush_tlb_func
> > > 1.04 -0.1 0.94 perf-profile.children.cycles-pp.mas_prev
> > > 2.71 -0.1 2.61 perf-profile.children.cycles-pp.mtree_load
> > > 2.23 -0.1 2.14 perf-profile.children.cycles-pp.native_flush_tlb_one_user
> > > 0.22 ± 5% -0.1 0.13 ± 13% perf-profile.children.cycles-pp.vm_stat_account
> > > 0.95 -0.1 0.87 perf-profile.children.cycles-pp.mas_prev_setup
> > > 1.65 -0.1 1.57 perf-profile.children.cycles-pp.mas_wr_walk
> > > 1.84 -0.1 1.76 perf-profile.children.cycles-pp.up_write
> > > 1.27 -0.1 1.20 perf-profile.children.cycles-pp.mas_prev_slot
> > > 1.84 -0.1 1.77 perf-profile.children.cycles-pp.vma_link
> > > 1.39 -0.1 1.32 perf-profile.children.cycles-pp.shuffle_freelist
> > > 0.96 -0.1 0.90 ± 2% perf-profile.children.cycles-pp.rcu_all_qs
> > > 0.86 -0.1 0.80 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> > > 1.70 -0.1 1.64 perf-profile.children.cycles-pp.__get_unmapped_area
> > > 0.34 ± 3% -0.1 0.29 ± 5% perf-profile.children.cycles-pp.security_vm_enough_memory_mm
> > > 0.60 -0.0 0.55 perf-profile.children.cycles-pp.entry_SYSCALL_64
> > > 0.92 -0.0 0.87 perf-profile.children.cycles-pp.percpu_counter_add_batch
> > > 1.07 -0.0 1.02 perf-profile.children.cycles-pp.vma_to_resize
> > > 1.59 -0.0 1.54 perf-profile.children.cycles-pp.mas_update_gap
> > > 0.44 ± 2% -0.0 0.40 ± 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> > > 0.70 -0.0 0.66 perf-profile.children.cycles-pp.syscall_return_via_sysret
> > > 1.13 -0.0 1.09 perf-profile.children.cycles-pp.mt_find
> > > 0.20 ± 6% -0.0 0.17 ± 9% perf-profile.children.cycles-pp.cap_vm_enough_memory
> > > 0.99 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node
> > > 0.63 ± 2% -0.0 0.59 perf-profile.children.cycles-pp.security_mmap_addr
> > > 0.62 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials
> > > 1.17 -0.0 1.14 perf-profile.children.cycles-pp.clear_bhb_loop
> > > 0.46 -0.0 0.43 ± 2% perf-profile.children.cycles-pp.__alloc_pages_noprof
> > > 0.44 -0.0 0.41 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
> > > 0.90 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete
> > > 0.64 ± 2% -0.0 0.62 perf-profile.children.cycles-pp.get_old_pud
> > > 1.07 -0.0 1.05 perf-profile.children.cycles-pp.mas_leaf_max_gap
> > > 0.22 ± 3% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.__rmqueue_pcplist
> > > 0.55 -0.0 0.53 perf-profile.children.cycles-pp.refill_obj_stock
> > > 0.25 -0.0 0.23 ą 3% perf-profile.children.cycles-pp.rmqueue
> > > 0.48 -0.0 0.45 perf-profile.children.cycles-pp.mremap_userfaultfd_prep
> > > 0.33 -0.0 0.30 perf-profile.children.cycles-pp.free_unref_page
> > > 0.46 -0.0 0.44 perf-profile.children.cycles-pp.setup_object
> > > 0.21 ± 3% -0.0 0.19 ± 2% perf-profile.children.cycles-pp.rmqueue_bulk
> > > 0.31 ą 3% -0.0 0.29 perf-profile.children.cycles-pp.__vm_enough_memory
> > > 0.40 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> > > 0.36 -0.0 0.35 perf-profile.children.cycles-pp.madvise_vma_behavior
> > > 0.54 -0.0 0.53 ± 2% perf-profile.children.cycles-pp.mas_wr_end_piv
> > > 0.46 -0.0 0.44 ± 2% perf-profile.children.cycles-pp.rcu_segcblist_enqueue
> > > 0.34 -0.0 0.32 ± 2% perf-profile.children.cycles-pp.mas_destroy
> > > 0.28 -0.0 0.26 ± 3% perf-profile.children.cycles-pp.mas_wr_store_setup
> > > 0.30 -0.0 0.28 perf-profile.children.cycles-pp.pte_offset_map_nolock
> > > 0.19 -0.0 0.18 ą 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders
> > > 0.08 ± 4% -0.0 0.07 perf-profile.children.cycles-pp.ksm_madvise
> > > 0.17 -0.0 0.16 perf-profile.children.cycles-pp.get_any_partial
> > > 0.08 -0.0 0.07 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
> > > 0.45 +0.0 0.47 perf-profile.children.cycles-pp._raw_spin_lock
> > > 1.10 +0.0 1.14 perf-profile.children.cycles-pp.zap_pte_range
> > > 0.78 +0.1 0.85 perf-profile.children.cycles-pp.__madvise
> > > 0.63 +0.1 0.70 perf-profile.children.cycles-pp.__x64_sys_madvise
> > > 0.62 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise
> > > 0.00 +0.1 0.09 ± 4% perf-profile.children.cycles-pp.can_modify_mm_madv
> > > 1.32 +0.1 1.46 perf-profile.children.cycles-pp.mas_next_slot
> > > 88.13 +0.7 88.83 perf-profile.children.cycles-pp.mremap
> > > 83.94 +0.9 84.88 perf-profile.children.cycles-pp.__do_sys_mremap
> > > 86.06 +0.9 87.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> > > 85.56 +1.0 86.54 perf-profile.children.cycles-pp.do_syscall_64
> > > 40.49 +1.4 41.90 perf-profile.children.cycles-pp.do_vmi_munmap
> > > 2.10 +1.5 3.57 perf-profile.children.cycles-pp.do_munmap
> > > 3.62 +2.3 5.90 perf-profile.children.cycles-pp.mas_walk
> > > 5.44 +2.9 8.38 perf-profile.children.cycles-pp.mremap_to
> > > 5.30 +3.1 8.39 perf-profile.children.cycles-pp.mas_find
> > > 0.00 +5.4 5.40 perf-profile.children.cycles-pp.can_modify_mm
> > > 11.46 -0.5 10.96 perf-profile.self.cycles-pp.__slab_free
> > > 4.30 -0.2 4.08 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
> > > 2.51 -0.2 2.34 perf-profile.self.cycles-pp.rcu_cblist_dequeue
> > > 2.41 ± 2% -0.2 2.25 perf-profile.self.cycles-pp.down_write
> > > 2.21 -0.1 2.11 perf-profile.self.cycles-pp.native_flush_tlb_one_user
> > > 2.37 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load
> > > 1.60 -0.1 1.51 perf-profile.self.cycles-pp.__memcg_slab_free_hook
> > > 0.18 ± 3% -0.1 0.10 ± 15% perf-profile.self.cycles-pp.vm_stat_account
> > > 1.25 -0.1 1.18 perf-profile.self.cycles-pp.move_vma
> > > 1.76 -0.1 1.69 perf-profile.self.cycles-pp.mod_objcg_state
> > > 1.42 -0.1 1.35 ± 2% perf-profile.self.cycles-pp.__call_rcu_common
> > > 1.41 -0.1 1.34 perf-profile.self.cycles-pp.mas_wr_walk
> > > 1.52 -0.1 1.46 perf-profile.self.cycles-pp.up_write
> > > 1.02 -0.1 0.95 perf-profile.self.cycles-pp.mas_prev_slot
> > > 0.96 -0.1 0.90 ± 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb
> > > 1.50 -0.1 1.45 perf-profile.self.cycles-pp.kmem_cache_free
> > > 0.69 ± 3% -0.1 0.64 ± 2% perf-profile.self.cycles-pp.rcu_all_qs
> > > 1.14 ± 2% -0.1 1.09 perf-profile.self.cycles-pp.shuffle_freelist
> > > 1.10 -0.1 1.05 perf-profile.self.cycles-pp.__cond_resched
> > > 1.40 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap
> > > 0.99 -0.0 0.94 perf-profile.self.cycles-pp.mas_preallocate
> > > 0.88 -0.0 0.83 perf-profile.self.cycles-pp.___slab_alloc
> > > 0.55 -0.0 0.50 perf-profile.self.cycles-pp.mremap_to
> > > 0.98 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes
> > > 0.78 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch
> > > 0.21 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64
> > > 0.44 ± 2% -0.0 0.40 ± 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> > > 0.92 -0.0 0.89 perf-profile.self.cycles-pp.mas_store_gfp
> > > 0.86 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node
> > > 0.50 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> > > 1.15 -0.0 1.12 perf-profile.self.cycles-pp.clear_bhb_loop
> > > 1.14 -0.0 1.11 perf-profile.self.cycles-pp.vma_merge
> > > 0.66 -0.0 0.63 perf-profile.self.cycles-pp.__split_vma
> > > 0.16 ± 6% -0.0 0.13 ± 7% perf-profile.self.cycles-pp.cap_vm_enough_memory
> > > 0.82 -0.0 0.79 perf-profile.self.cycles-pp.mas_wr_store_entry
> > > 0.54 ± 2% -0.0 0.52 perf-profile.self.cycles-pp.get_old_pud
> > > 0.43 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap
> > > 0.51 ± 2% -0.0 0.48 ± 2% perf-profile.self.cycles-pp.security_mmap_addr
> > > 0.50 -0.0 0.48 perf-profile.self.cycles-pp.refill_obj_stock
> > > 0.24 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev
> > > 0.71 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range
> > > 0.48 -0.0 0.45 perf-profile.self.cycles-pp.find_vma_prev
> > > 0.42 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> > > 0.66 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc
> > > 0.31 -0.0 0.29 perf-profile.self.cycles-pp.mas_prev_setup
> > > 0.43 -0.0 0.41 perf-profile.self.cycles-pp.mas_wr_end_piv
> > > 0.78 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete
> > > 0.28 -0.0 0.26 ± 2% perf-profile.self.cycles-pp.mas_put_in_tree
> > > 0.42 -0.0 0.40 perf-profile.self.cycles-pp.mremap_userfaultfd_prep
> > > 0.28 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables
> > > 0.39 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> > > 0.30 ± 2% -0.0 0.28 perf-profile.self.cycles-pp.zap_pmd_range
> > > 0.32 -0.0 0.31 perf-profile.self.cycles-pp.unmap_vmas
> > > 0.21 -0.0 0.20 perf-profile.self.cycles-pp.__get_unmapped_area
> > > 0.18 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.lru_add_drain_cpu
> > > 0.06 -0.0 0.05 perf-profile.self.cycles-pp.ksm_madvise
> > > 0.45 +0.0 0.46 perf-profile.self.cycles-pp.do_vmi_munmap
> > > 0.37 +0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock
> > > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot
> > > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find
> > > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm
> > > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk
> > >
> > >
> > > ***************************************************************************************************
> > > lkp-spr-r02: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
> > > =========================================================================================
> > > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> > > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pkey/stress-ng/60s
> > >
> > > commit:
> > > ff388fe5c4 ("mseal: wire up mseal syscall")
> > > 8be7258aad ("mseal: add mseal syscall")
> > >
> > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13
> > > ---------------- ---------------------------
> > > %stddev %change %stddev
> > > \ | \
> > > 10539 -2.5% 10273 vmstat.system.cs
> > > 0.28 ± 5% -20.1% 0.22 ± 7% sched_debug.cfs_rq:/.h_nr_running.stddev
> > > 1419 ± 7% -15.3% 1202 ± 6% sched_debug.cfs_rq:/.util_avg.max
> > > 0.28 ± 6% -18.4% 0.23 ± 8% sched_debug.cpu.nr_running.stddev
> > > 8.736e+08 -3.6% 8.423e+08 stress-ng.pkey.ops
> > > 14560560 -3.6% 14038795 stress-ng.pkey.ops_per_sec
> > > 770.39 ± 4% -5.0% 732.04 stress-ng.time.user_time
> > > 244657 ± 3% +5.8% 258782 ± 3% proc-vmstat.nr_slab_unreclaimable
> > > 73133541 -2.1% 71588873 proc-vmstat.numa_hit
> > > 72873579 -2.1% 71357274 proc-vmstat.numa_local
> > > 1.842e+08 -2.5% 1.796e+08 proc-vmstat.pgalloc_normal
> > > 1.767e+08 -2.8% 1.717e+08 proc-vmstat.pgfree
> > > 1345346 ± 40% -73.1% 362064 ±124% numa-vmstat.node0.nr_inactive_anon
> > > 1345340 ± 40% -73.1% 362062 ±124% numa-vmstat.node0.nr_zone_inactive_anon
> > > 2420830 ± 14% +35.1% 3270248 ± 16% numa-vmstat.node1.nr_file_pages
> > > 2067871 ± 13% +51.5% 3132982 ± 17% numa-vmstat.node1.nr_inactive_anon
> > > 191406 ± 17% +33.6% 255808 ± 14% numa-vmstat.node1.nr_mapped
> > > 2452 ± 61% +104.4% 5012 ± 35% numa-vmstat.node1.nr_page_table_pages
> > > 2067853 ± 13% +51.5% 3132966 ± 17% numa-vmstat.node1.nr_zone_inactive_anon
> > > 5379238 ± 40% -73.0% 1453605 ±123% numa-meminfo.node0.Inactive
> > > 5379166 ± 40% -73.0% 1453462 ±123% numa-meminfo.node0.Inactive(anon)
> > > 8741077 ± 22% -36.7% 5531290 ± 28% numa-meminfo.node0.MemUsed
> > > 9651902 ± 13% +35.8% 13105318 ± 16% numa-meminfo.node1.FilePages
> > > 8239855 ± 13% +52.4% 12556929 ± 17% numa-meminfo.node1.Inactive
> > > 8239712 ± 13% +52.4% 12556853 ± 17% numa-meminfo.node1.Inactive(anon)
> > > 761944 ± 18% +34.6% 1025906 ± 14% numa-meminfo.node1.Mapped
> > > 11679628 ± 11% +31.2% 15322841 ± 14% numa-meminfo.node1.MemUsed
> > > 9874 ± 62% +104.6% 20200 ± 36% numa-meminfo.node1.PageTables
> > > 0.74 -4.2% 0.71 perf-stat.i.MPKI
> > > 1.245e+11 +2.3% 1.274e+11 perf-stat.i.branch-instructions
> > > 0.37 -0.0 0.35 perf-stat.i.branch-miss-rate%
> > > 4.359e+08 -2.1% 4.265e+08 perf-stat.i.branch-misses
> > > 4.672e+08 -2.6% 4.548e+08 perf-stat.i.cache-misses
> > > 7.276e+08 -2.7% 7.082e+08 perf-stat.i.cache-references
> > > 1.00 -1.6% 0.98 perf-stat.i.cpi
> > > 1364 +2.9% 1404 perf-stat.i.cycles-between-cache-misses
> > > 6.392e+11 +1.7% 6.499e+11 perf-stat.i.instructions
> > > 1.00 +1.6% 1.02 perf-stat.i.ipc
> > > 0.74 -4.3% 0.71 perf-stat.overall.MPKI
> > > 0.35 -0.0 0.33 perf-stat.overall.branch-miss-rate%
> > > 1.00 -1.6% 0.99 perf-stat.overall.cpi
> > > 1356 +2.9% 1395 perf-stat.overall.cycles-between-cache-misses
> > > 1.00 +1.6% 1.01 perf-stat.overall.ipc
> > > 1.209e+11 +1.9% 1.232e+11 perf-stat.ps.branch-instructions
> > > 4.188e+08 -2.6% 4.077e+08 perf-stat.ps.branch-misses
> > > 4.585e+08 -3.1% 4.441e+08 perf-stat.ps.cache-misses
> > > 7.124e+08 -3.1% 6.901e+08 perf-stat.ps.cache-references
> > > 10321 -2.6% 10053 perf-stat.ps.context-switches
> > >
> > >
> > >
> > >
> > >
> > > Disclaimer:
> > > Results have been estimated based on internal Intel analysis and are provided
> > > for informational purposes only. Any difference in system hardware or software
> > > design or configuration may affect actual performance.
> > >
> > >
> > > --
> > > 0-DAY CI Kernel Test Service
> > > https://github.com/intel/lkp-tests/wiki
> > >
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 6:04 ` Oliver Sang
2024-08-06 14:38 ` Linus Torvalds
@ 2024-08-06 21:37 ` Pedro Falcato
2024-08-07 5:54 ` Oliver Sang
1 sibling, 1 reply; 29+ messages in thread
From: Pedro Falcato @ 2024-08-06 21:37 UTC (permalink / raw)
To: Oliver Sang
Cc: Linus Torvalds, Jeff Xu, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Jeff Xu, oe-lkp, lkp, linux-kernel,
Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet,
Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin
On Tue, Aug 6, 2024 at 7:05 AM Oliver Sang <oliver.sang@intel.com> wrote:
>
> hi, Linus,
>
> On Mon, Aug 05, 2024 at 12:33:58PM -0700, Linus Torvalds wrote:
> > On Mon, 5 Aug 2024 at 11:55, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > So please consider this a "maybe something like this" patch, but that
> > > 'arch_unmap()' really is pretty nasty
> >
> > Actually, the whole powerpc vdso code confused me. It's not the vvar
> > thing that wants this close thing, it's the other ones that have the
> > remap thing.
> >
> > .. and there were two of those error cases that needed to reset the
> > vdso pointer.
> >
> > That all shows just how carefully I was reading this code.
> >
> > New version - still untested, but now I've read through it one more
> > time - attached.
>
> we tested this version by applying it directly on top of 8be7258aad, but it
> seems to have little impact on performance; we still see a similar regression
> when comparing to ff388fe5c4.
Hi,
I've just sent out a patch set[1] that should alleviate (or hopefully
totally fix) these performance regressions. It'd be great if you could
test it.
For everyone: apologies if you were on the original CC list and I didn't
CC you here, but I tried to keep my patch set's CC list relatively short
and clean (and focused on the active participants).
Everyone's comments are very welcome.
[1]: https://lore.kernel.org/all/20240806212808.1885309-1-pedro.falcato@gmail.com/
--
Pedro
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 21:37 ` Pedro Falcato
@ 2024-08-07 5:54 ` Oliver Sang
0 siblings, 0 replies; 29+ messages in thread
From: Oliver Sang @ 2024-08-07 5:54 UTC (permalink / raw)
To: Pedro Falcato
Cc: Linus Torvalds, Jeff Xu, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Jeff Xu, oe-lkp, lkp, linux-kernel,
Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet,
Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin, oliver.sang
hi, Pedro,
On Tue, Aug 06, 2024 at 10:37:08PM +0100, Pedro Falcato wrote:
> On Tue, Aug 6, 2024 at 7:05 AM Oliver Sang <oliver.sang@intel.com> wrote:
> >
> > hi, Linus,
> >
> > On Mon, Aug 05, 2024 at 12:33:58PM -0700, Linus Torvalds wrote:
> > > On Mon, 5 Aug 2024 at 11:55, Linus Torvalds
> > > <torvalds@linux-foundation.org> wrote:
> > > >
> > > > So please consider this a "maybe something like this" patch, but that
> > > > 'arch_unmap()' really is pretty nasty
> > >
> > > Actually, the whole powerpc vdso code confused me. It's not the vvar
> > > thing that wants this close thing, it's the other ones that have the
> > > remap thing.
> > >
> > > .. and there were two of those error cases that needed to reset the
> > > vdso pointer.
> > >
> > > That all shows just how carefully I was reading this code.
> > >
> > > New version - still untested, but now I've read through it one more
> > > time - attached.
> >
> > we tested this version by applying it directly on top of 8be7258aad, but it
> > seems to have little impact on performance; we still see a similar regression
> > when comparing to ff388fe5c4.
>
> Hi,
>
> I've just sent out a patch set[1] that should alleviate (or hopefully
> totally fix) these performance regressions. It'd be great if you could
> test it.
yes, your patch set totally fixes the regression.
our bot automatically fetched the patch set and applied it on top of mainline
d4560686726f7, as below.
d58de4f958df2 (linux-review/Pedro-Falcato/mm-Move-can_modify_vma-to-mm-internal-h/20240807-054658) mm: Remove can_modify_mm()
32668c3efc23f mseal: Replace can_modify_mm_madv with a vma variant
5c3f48cf634c9 mseal: Fix is_madv_discard()
8cde2d71bd0f8 mm/mremap: Replace can_modify_mm with can_modify_vma
cc3471461a854 mm/mprotect: Replace can_modify_mm with can_modify_vma
abff8a9b6023e mm/munmap: Replace can_modify_mm with can_modify_vma
c1bf07aa19804 mm: Move can_modify_vma to mm/internal.h
d4560686726f7 (HEAD, linus/master) Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
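As context for the patch titles above: the profiles in this thread show the regression concentrated in mas_find/mas_walk under can_modify_mm(), i.e. an extra up-front scan of the whole affected range before every munmap/mremap, while the patch set folds the sealing check into the per-VMA walk those paths already perform. A minimal illustrative sketch of the two shapes, using hypothetical simplified structures (a flat array standing in for the maple tree), not the actual kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; "sealed" models the mseal bit. */
struct vma_sketch {
    unsigned long start, end;
    bool sealed;
};

/* Counts VMAs visited, to make the extra traversal visible. */
static size_t walk_count;

/* 8be7258aad-style shape: scan every VMA overlapping [start, end)
 * *before* the real unmap loop runs -- a whole extra pass per syscall,
 * paid even in the common case where nothing is sealed. */
static bool can_modify_mm_sketch(const struct vma_sketch *v, size_t n,
                                 unsigned long start, unsigned long end)
{
    for (size_t i = 0; i < n; i++) {
        if (v[i].start < end && v[i].end > start) {
            walk_count++;
            if (v[i].sealed)
                return false;
        }
    }
    return true;
}

/* Patch-set-style shape: check one VMA inline, from the walk the
 * unmap/mremap path already does, so no extra traversal is added. */
static bool can_modify_vma_sketch(const struct vma_sketch *v)
{
    walk_count++; /* this visit happens anyway during the existing walk */
    return !v->sealed;
}
```

With three unsealed VMAs covering [0, 30), the up-front variant visits all three and only then lets the unmap walk visit them again, roughly doubling tree traversal per syscall, which matches the mas_walk/mas_find growth in the profiles.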
I tested the patch set tip d58de4f958df2 as well as d4560686726f7; below are the
results, combined with 8be7258aad and its parent.
data from 8be7258aad and d4560686726f7 are close enough to be within the noise.
the patch set tip recovers the performance to the level of ff388fe5c4.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
commit:
ff388fe5c4 ("mseal: wire up mseal syscall")
8be7258aad ("mseal: add mseal syscall")
d456068672 ("Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost")
d58de4f958 ("mm: Remove can_modify_mm()")
ff388fe5c481d39c 8be7258aad44b5e25977a98db13 d4560686726f7a357922f300fc8 d58de4f958df225c04fd490fe2d
---------------- --------------------------- --------------------------- ---------------------------
%stddev %change %stddev %change %stddev %change %stddev
\ | \ | \ | \
44.92 -0.4% 44.76 -5.1% 42.62 -5.7% 42.37 boot-time.boot
33.12 -0.4% 33.00 -7.0% 30.81 -7.0% 30.81 boot-time.dhcp
2631 -0.4% 2620 -5.6% 2483 -6.2% 2468 boot-time.idle
4958 +1.3% 5024 +1.2% 5017 +0.0% 4960 time.percent_of_cpu_this_job_got
2916 +1.5% 2960 +1.4% 2956 +0.1% 2919 time.system_time
65.85 -7.0% 61.27 -6.8% 61.40 -3.4% 63.64 time.user_time
17869 ± 8% -5.6% 16869 ± 28% -24.5% 13488 ± 25% -3.5% 17240 ± 9% numa-vmstat.node0.nr_slab_reclaimable
5182 ± 29% +19.8% 6207 ± 75% +80.1% 9334 ± 36% +7.9% 5591 ± 28% numa-vmstat.node1.nr_slab_reclaimable
10153 ±170% +1041.4% 115893 ±214% +2787.4% 293183 ± 97% +371.7% 47894 ± 90% numa-vmstat.node1.nr_unevictable
10153 ±170% +1041.4% 115893 ±214% +2787.4% 293183 ± 97% +371.7% 47894 ± 90% numa-vmstat.node1.nr_zone_unevictable
71475 ± 8% -5.6% 67478 ± 28% -24.5% 53952 ± 25% -3.5% 68960 ± 9% numa-meminfo.node0.KReclaimable
71475 ± 8% -5.6% 67478 ± 28% -24.5% 53952 ± 25% -3.5% 68960 ± 9% numa-meminfo.node0.SReclaimable
20732 ± 29% +19.8% 24839 ± 75% +80.1% 37346 ± 36% +7.9% 22364 ± 28% numa-meminfo.node1.KReclaimable
20732 ± 29% +19.8% 24839 ± 75% +80.1% 37346 ± 36% +7.9% 22364 ± 28% numa-meminfo.node1.SReclaimable
40615 ±170% +1041.4% 463573 ±214% +2787.4% 1172733 ± 97% +371.7% 191576 ± 90% numa-meminfo.node1.Unevictable
23051 +0.1% 23079 -1.0% 22823 -1.0% 22831 proc-vmstat.nr_slab_reclaimable
41535129 -4.5% 39669773 -4.9% 39501465 -0.3% 41415171 proc-vmstat.numa_hit
41465484 -4.5% 39602956 -4.9% 39434855 -0.3% 41347677 proc-vmstat.numa_local
77303973 -4.6% 73780662 -5.0% 73449965 -0.3% 77049179 proc-vmstat.pgalloc_normal
77022096 -4.6% 73502058 -5.0% 73168463 -0.3% 76769054 proc-vmstat.pgfree
18381956 -4.9% 17473438 -5.1% 17450543 -0.4% 18316849 stress-ng.pagemove.ops
306349 -4.9% 291188 -5.1% 290820 -0.4% 305268 stress-ng.pagemove.ops_per_sec
209930 -6.2% 196996 ± 2% -5.4% 198614 -0.5% 208922 stress-ng.pagemove.page_remaps_per_sec
4958 +1.3% 5024 +1.2% 5017 +0.0% 4960 stress-ng.time.percent_of_cpu_this_job_got
2916 +1.5% 2960 +1.4% 2956 +0.1% 2919 stress-ng.time.system_time
3.337e+10 ± 4% +2.3% 3.414e+10 ± 3% +5.0% 3.503e+10 +1.2% 3.376e+10 perf-stat.i.branch-instructions
1.13 -2.1% 1.10 -2.3% 1.10 +0.1% 1.13 perf-stat.i.cpi
1.695e+11 ± 4% +1.1% 1.715e+11 ± 3% +3.8% 1.761e+11 +1.2% 1.715e+11 perf-stat.i.instructions
0.89 +2.2% 0.91 +2.1% 0.91 -0.4% 0.89 perf-stat.i.ipc
1.04 -7.2% 0.97 -7.2% 0.97 -0.2% 1.04 perf-stat.overall.MPKI
1.13 -2.3% 1.10 -2.1% 1.10 +0.3% 1.13 perf-stat.overall.cpi
1082 +5.4% 1140 +5.5% 1141 +0.5% 1087 perf-stat.overall.cycles-between-cache-misses
0.89 +2.3% 0.91 +2.1% 0.91 -0.3% 0.88 perf-stat.overall.ipc
3.284e+10 ± 4% +2.4% 3.362e+10 ± 2% +4.8% 3.443e+10 +1.1% 3.32e+10 perf-stat.ps.branch-instructions
192.79 -3.9% 185.32 ± 2% -1.7% 189.49 +0.2% 193.10 perf-stat.ps.cpu-migrations
1.669e+11 ± 4% +1.2% 1.689e+11 ± 2% +3.7% 1.731e+11 +1.1% 1.687e+11 perf-stat.ps.instructions
1.048e+13 +2.8% 1.078e+13 +2.1% 1.07e+13 -0.6% 1.042e+13 perf-stat.total.instructions
74.97 -1.9 73.07 -1.7 73.32 +0.4 75.38 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
36.79 -1.6 35.22 -1.4 35.36 +0.3 37.08 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
24.98 -1.3 23.64 -1.3 23.73 +0.0 24.99 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
19.91 -1.1 18.85 -1.2 18.69 -0.2 19.72 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.33 ± 3% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.33 ± 3% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.33 ± 3% perf-profile.calltrace.cycles-pp.ret_from_fork_asm
10.64 ± 3% -0.9 9.79 ± 3% -0.9 9.73 ± 2% -0.4 10.29 ± 3% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
10.63 ± 3% -0.9 9.78 ± 3% -0.9 9.72 ± 2% -0.4 10.28 ± 3% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
10.63 ± 3% -0.9 9.78 ± 3% -0.9 9.72 ± 2% -0.4 10.28 ± 3% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
10.63 ± 3% -0.9 9.78 ± 3% -0.9 9.72 ± 2% -0.4 10.28 ± 3% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread
10.59 ± 3% -0.8 9.74 ± 3% -0.9 9.68 ± 2% -0.4 10.24 ± 3% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn
14.77 -0.8 14.00 -0.7 14.11 +0.0 14.80 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
1.48 -0.5 0.99 -0.5 0.99 +0.0 1.52 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
5.95 ± 3% -0.5 5.47 ± 3% -0.5 5.44 ± 2% -0.2 5.73 ± 3% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
7.88 -0.4 7.48 -0.3 7.57 +0.1 7.97 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.62 ± 3% -0.4 4.25 ± 3% -0.4 4.20 ± 2% -0.2 4.42 ± 3% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
6.72 -0.4 6.36 -0.4 6.33 -0.1 6.66 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
6.15 -0.3 5.82 -0.3 5.86 +0.0 6.16 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
6.11 -0.3 5.78 -0.3 5.77 -0.0 6.07 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap
5.78 -0.3 5.49 -0.2 5.57 +0.1 5.85 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
5.54 -0.3 5.25 -0.3 5.28 +0.0 5.56 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap
5.56 -0.3 5.28 -0.3 5.28 -0.0 5.54 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap
5.19 -0.3 4.92 -0.2 4.95 +0.0 5.21 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap
5.20 -0.3 4.94 -0.3 4.95 -0.0 5.18 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma
3.20 ± 4% -0.3 2.94 ± 3% -0.3 2.93 ± 2% -0.1 3.11 ± 3% perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
4.09 -0.2 3.85 -0.3 3.82 -0.1 4.03 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
4.68 -0.2 4.45 -0.2 4.46 -0.0 4.67 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma
2.63 ± 3% -0.2 2.42 ± 3% -0.2 2.43 ± 2% -0.1 2.57 ± 3% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs
2.36 ± 2% -0.2 2.16 ± 4% -0.3 2.04 ± 14% -0.1 2.28 ± 3% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete
3.56 -0.2 3.36 -0.2 3.34 -0.0 3.52 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap
4.00 -0.2 3.81 -0.1 3.87 ± 2% +0.1 4.06 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma
1.35 -0.2 1.16 -0.2 1.16 +0.0 1.36 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
3.40 -0.2 3.22 -0.2 3.24 +0.0 3.41 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap
2.22 -0.2 2.06 -0.2 2.07 +0.0 2.24 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
0.96 -0.2 0.82 -0.2 0.81 +0.0 0.97 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to
3.25 -0.1 3.10 -0.1 3.14 +0.0 3.30 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
1.81 ± 4% -0.1 1.67 ± 3% -0.2 1.64 ± 2% -0.1 1.74 ± 3% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
1.97 ± 3% -0.1 1.83 ± 3% -0.6 1.41 ± 3% -0.5 1.50 ± 2% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
2.26 -0.1 2.12 -0.2 2.05 -0.1 2.16 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
3.10 -0.1 2.96 +0.3 3.38 +0.5 3.60 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
3.13 -0.1 2.99 -0.1 3.06 +0.1 3.23 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma
2.97 -0.1 2.85 -0.2 2.75 ± 2% -0.0 2.94 ± 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
2.05 -0.1 1.93 -0.1 1.98 -0.1 1.99 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap
8.26 -0.1 8.14 +0.2 8.45 +0.5 8.78 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
2.45 -0.1 2.34 -0.1 2.34 +0.0 2.46 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma
2.43 -0.1 2.32 -0.0 2.39 +0.1 2.55 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
1.75 ± 2% -0.1 1.64 ± 3% -0.1 1.64 ± 4% +0.0 1.77 ± 4% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap
0.54 -0.1 0.44 ± 37% -0.0 0.51 +0.0 0.55 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
1.27 ± 2% -0.1 1.16 ± 4% -0.1 1.14 ± 6% -0.0 1.23 ± 4% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge
1.32 ± 3% -0.1 1.22 ± 3% -0.1 1.20 ± 2% -0.0 1.28 ± 3% perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
2.21 -0.1 2.11 -0.1 2.11 +0.0 2.23 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables
1.85 -0.1 1.76 -0.1 1.78 +0.0 1.87 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
2.14 ± 2% -0.1 2.05 ± 2% -0.1 2.00 ± 2% +0.0 2.14 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap
1.79 ± 2% -0.1 1.70 +0.1 1.93 +0.3 2.06 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma
1.40 -0.1 1.31 -0.1 1.27 -0.1 1.34 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap
1.39 -0.1 1.30 -0.1 1.34 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma
1.24 -0.1 1.16 -0.1 1.13 -0.1 1.19 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap
0.94 -0.1 0.86 -0.1 0.86 +0.0 0.96 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap
1.23 -0.1 1.15 -0.0 1.18 -0.1 1.18 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma
1.54 -0.1 1.46 -0.0 1.50 +0.1 1.60 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
0.73 -0.1 0.67 -0.1 0.67 +0.0 0.74 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
1.15 -0.1 1.09 -0.1 1.08 -0.0 1.13 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
0.60 ± 2% -0.1 0.54 -0.0 0.56 -0.0 0.59 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64
1.27 -0.1 1.21 -0.0 1.22 +0.0 1.30 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
38.74 -0.1 38.68 +0.1 38.80 +0.3 39.06 perf-profile.calltrace.cycles-pp.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.38 ± 4% -0.1 1.32 ± 2% -0.2 1.20 ± 3% -0.1 1.27 ± 2% perf-profile.calltrace.cycles-pp.obj_cgroup_charge.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
0.72 -0.1 0.66 -0.1 0.66 +0.0 0.72 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap
0.70 ± 2% -0.1 0.64 ± 3% +0.1 0.80 ± 3% +0.2 0.85 ± 3% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma
0.79 -0.1 0.73 -0.1 0.73 +0.0 0.79 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma
0.80 ± 2% -0.1 0.75 -0.1 0.72 ± 3% -0.0 0.77 ± 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge
0.78 -0.1 0.72 -0.0 0.73 +0.0 0.78 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge
1.02 -0.1 0.96 +0.0 1.02 +0.1 1.09 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
1.63 -0.1 1.58 -0.1 1.58 +0.0 1.64 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.62 -0.0 0.58 -0.1 0.57 +0.0 0.63 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma
0.60 ± 3% -0.0 0.56 ± 3% -0.0 0.59 ± 3% +0.0 0.63 ± 3% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core
0.67 -0.0 0.62 -0.1 0.59 -0.1 0.61 ± 2% perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
0.86 -0.0 0.81 -0.0 0.82 +0.0 0.87 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64
1.02 -0.0 0.97 -0.0 0.98 +0.0 1.04 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.76 ± 2% -0.0 0.71 -0.1 0.71 ± 2% -0.0 0.74 ± 2% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
0.81 -0.0 0.77 -0.1 0.76 -0.0 0.81 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
0.70 -0.0 0.66 -0.0 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
0.67 ± 2% -0.0 0.63 -0.0 0.65 ± 2% +0.0 0.68 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap
0.56 -0.0 0.51 -0.2 0.38 ± 57% +0.0 0.56 perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma
0.69 -0.0 0.65 -0.0 0.64 ± 2% -0.0 0.68 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma
0.98 -0.0 0.93 -0.0 0.94 +0.0 0.98 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
0.77 ± 5% -0.0 0.73 ± 2% -0.1 0.66 ± 4% -0.1 0.70 ± 4% perf-profile.calltrace.cycles-pp.obj_cgroup_charge.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma
0.78 -0.0 0.74 -0.0 0.75 +0.0 0.79 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap
1.12 -0.0 1.08 -0.1 1.06 +0.0 1.12 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap
0.68 -0.0 0.65 -0.0 0.66 +0.0 0.68 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap
1.00 -0.0 0.97 -0.0 0.96 +0.0 1.02 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap
0.62 -0.0 0.59 -0.0 0.59 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
0.88 -0.0 0.85 -0.0 0.85 +0.0 0.88 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
1.15 -0.0 1.12 -0.1 1.08 -0.0 1.13 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
0.60 -0.0 0.57 ± 2% +0.0 0.62 +0.1 0.66 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
0.59 -0.0 0.56 -0.0 0.56 -0.0 0.57 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap
0.62 ± 2% -0.0 0.59 ± 2% -0.0 0.59 +0.0 0.63 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
0.65 -0.0 0.63 -0.0 0.63 +0.0 0.66 perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
0.55 -0.0 0.53 +0.0 0.58 +0.1 0.61 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.move_ptes.move_page_tables.move_vma.__do_sys_mremap
0.74 -0.0 0.72 -0.1 0.68 ± 2% -0.0 0.71 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap
0.67 +0.1 0.74 +0.1 0.73 +0.0 0.68 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
0.76 +0.1 0.84 +0.1 0.82 +0.0 0.78 perf-profile.calltrace.cycles-pp.__madvise
0.66 +0.1 0.74 +0.1 0.73 +0.0 0.67 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
0.63 +0.1 0.71 +0.1 0.70 +0.0 0.64 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
0.62 +0.1 0.70 +0.1 0.69 +0.0 0.64 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
3.47 +0.1 3.55 +0.4 3.89 +0.5 3.95 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma
87.67 +0.8 88.47 +0.9 88.53 +0.3 88.01 perf-profile.calltrace.cycles-pp.mremap
0.00 +0.9 0.86 +0.8 0.84 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap
0.00 +0.9 0.88 +0.9 0.86 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap
0.00 +0.9 0.90 ± 2% +0.9 0.90 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma
84.82 +1.0 85.80 +1.0 85.84 +0.4 85.19 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap
84.66 +1.0 85.65 +1.0 85.69 +0.4 85.04 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
83.71 +1.0 84.73 +1.2 84.89 +0.5 84.18 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
0.00 +1.1 1.10 +1.1 1.08 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64
0.00 +1.2 1.21 +1.2 1.20 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to
2.09 +1.5 3.60 +1.5 3.59 +0.0 2.11 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +1.5 1.51 +1.5 1.50 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap
1.59 +1.5 3.12 +1.5 3.11 +0.0 1.60 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64
0.00 +1.6 1.62 +1.6 1.59 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +1.7 1.72 +1.7 1.72 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
0.00 +2.0 2.01 +2.0 1.99 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
5.34 +3.0 8.38 +3.0 8.34 +0.1 5.41 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
75.13 -1.9 73.22 -1.7 73.47 +0.4 75.55 perf-profile.children.cycles-pp.move_vma
37.01 -1.6 35.43 -1.4 35.56 +0.3 37.30 perf-profile.children.cycles-pp.do_vmi_align_munmap
25.06 -1.3 23.71 -1.3 23.80 +0.0 25.06 perf-profile.children.cycles-pp.copy_vma
20.00 -1.1 18.94 -1.2 18.77 -0.2 19.81 perf-profile.children.cycles-pp.__split_vma
19.86 -1.0 18.87 -0.9 18.92 -0.0 19.84 perf-profile.children.cycles-pp.rcu_core
19.84 -1.0 18.85 -0.9 18.90 -0.0 19.82 perf-profile.children.cycles-pp.rcu_do_batch
19.88 -1.0 18.89 -0.9 18.94 -0.0 19.86 perf-profile.children.cycles-pp.handle_softirqs
10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.33 ± 3% perf-profile.children.cycles-pp.kthread
10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.34 ± 3% perf-profile.children.cycles-pp.ret_from_fork
10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.34 ± 3% perf-profile.children.cycles-pp.ret_from_fork_asm
10.64 ± 3% -0.9 9.79 ± 3% -0.9 9.73 ± 2% -0.4 10.29 ± 3% perf-profile.children.cycles-pp.smpboot_thread_fn
10.63 ± 3% -0.9 9.78 ± 3% -0.9 9.72 ± 2% -0.4 10.28 ± 3% perf-profile.children.cycles-pp.run_ksoftirqd
17.53 -0.8 16.70 -0.8 16.76 +0.0 17.54 perf-profile.children.cycles-pp.kmem_cache_free
15.28 -0.8 14.47 -1.0 14.33 -0.2 15.04 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
15.16 -0.8 14.37 -0.7 14.48 +0.0 15.20 perf-profile.children.cycles-pp.vma_merge
12.18 -0.6 11.54 -0.6 11.60 +0.0 12.20 perf-profile.children.cycles-pp.mas_wr_store_entry
11.98 -0.6 11.36 -0.6 11.41 +0.0 11.98 perf-profile.children.cycles-pp.mas_store_prealloc
12.11 -0.6 11.51 -0.6 11.50 -0.1 12.02 perf-profile.children.cycles-pp.__slab_free
10.86 -0.6 10.26 -0.7 10.21 -0.1 10.75 perf-profile.children.cycles-pp.vm_area_dup
9.89 -0.5 9.40 -0.5 9.44 +0.0 9.93 perf-profile.children.cycles-pp.mas_wr_node_store
8.36 -0.4 7.92 -0.4 7.97 +0.1 8.49 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
7.98 -0.4 7.58 -0.3 7.68 +0.1 8.08 perf-profile.children.cycles-pp.move_page_tables
6.69 -0.4 6.33 -0.3 6.39 +0.0 6.72 perf-profile.children.cycles-pp.vma_complete
5.86 -0.3 5.56 -0.2 5.64 +0.1 5.93 perf-profile.children.cycles-pp.move_ptes
5.11 -0.3 4.81 -0.3 4.80 -0.2 4.95 perf-profile.children.cycles-pp.mas_preallocate
6.05 -0.3 5.75 -0.3 5.77 +0.0 6.07 perf-profile.children.cycles-pp.vm_area_free_rcu_cb
2.98 ± 2% -0.3 2.73 ± 4% -0.3 2.66 ± 6% -0.1 2.88 ± 3% perf-profile.children.cycles-pp.__memcpy
3.48 -0.2 3.26 -0.2 3.25 -0.0 3.45 perf-profile.children.cycles-pp.___slab_alloc
3.46 ± 2% -0.2 3.26 +0.3 3.71 ± 2% +0.5 3.92 ± 2% perf-profile.children.cycles-pp.mod_objcg_state
2.91 -0.2 2.73 -0.2 2.73 -0.1 2.79 perf-profile.children.cycles-pp.mas_alloc_nodes
2.43 -0.2 2.25 -0.2 2.27 +0.0 2.45 perf-profile.children.cycles-pp.find_vma_prev
3.47 -0.2 3.29 -0.2 3.27 ± 2% +0.0 3.50 ± 2% perf-profile.children.cycles-pp.down_write
3.46 -0.2 3.28 -0.2 3.30 +0.0 3.46 perf-profile.children.cycles-pp.flush_tlb_mm_range
4.22 -0.2 4.06 -0.3 3.91 -0.1 4.16 perf-profile.children.cycles-pp.anon_vma_clone
3.32 -0.2 3.17 -0.1 3.25 +0.1 3.42 perf-profile.children.cycles-pp.__memcg_slab_free_hook
3.35 -0.2 3.20 -0.1 3.24 +0.0 3.40 perf-profile.children.cycles-pp.mas_store_gfp
2.22 -0.1 2.07 -0.1 2.12 +0.0 2.24 perf-profile.children.cycles-pp.__cond_resched
2.05 ± 2% -0.1 1.91 -0.1 1.92 -0.0 2.04 perf-profile.children.cycles-pp.allocate_slab
3.18 -0.1 3.04 -0.1 3.11 +0.1 3.28 perf-profile.children.cycles-pp.unmap_vmas
2.24 -0.1 2.11 ± 2% -0.1 2.10 ± 3% +0.0 2.25 ± 3% perf-profile.children.cycles-pp.vma_prepare
2.12 -0.1 2.00 -0.2 1.95 -0.0 2.08 perf-profile.children.cycles-pp.__call_rcu_common
2.66 -0.1 2.53 -0.1 2.53 +0.0 2.68 perf-profile.children.cycles-pp.mtree_load
2.46 -0.1 2.34 -0.1 2.34 +0.0 2.47 perf-profile.children.cycles-pp.rcu_cblist_dequeue
2.45 ± 4% -0.1 2.33 ± 2% -0.3 2.15 ± 3% -0.2 2.28 ± 2% perf-profile.children.cycles-pp.obj_cgroup_charge
2.49 -0.1 2.38 -0.1 2.39 +0.0 2.51 perf-profile.children.cycles-pp.flush_tlb_func
8.32 -0.1 8.21 +0.2 8.52 +0.5 8.85 perf-profile.children.cycles-pp.unmap_region
2.48 -0.1 2.37 -0.0 2.44 +0.1 2.59 perf-profile.children.cycles-pp.unmap_page_range
2.23 -0.1 2.13 -0.1 2.12 +0.0 2.24 perf-profile.children.cycles-pp.native_flush_tlb_one_user
1.77 -0.1 1.67 -0.1 1.68 -0.0 1.76 perf-profile.children.cycles-pp.mas_wr_walk
1.88 -0.1 1.78 -0.1 1.80 +0.0 1.89 perf-profile.children.cycles-pp.vma_link
1.40 -0.1 1.31 -0.1 1.32 -0.0 1.40 ± 2% perf-profile.children.cycles-pp.shuffle_freelist
1.84 -0.1 1.75 -0.1 1.75 +0.0 1.85 perf-profile.children.cycles-pp.up_write
0.97 ± 2% -0.1 0.88 -0.1 0.90 ± 2% -0.0 0.94 ± 2% perf-profile.children.cycles-pp.rcu_all_qs
1.03 -0.1 0.95 -0.1 0.94 +0.0 1.04 perf-profile.children.cycles-pp.mas_prev
0.92 -0.1 0.85 -0.1 0.84 -0.0 0.92 perf-profile.children.cycles-pp.mas_prev_setup
1.58 -0.1 1.50 -0.0 1.54 +0.1 1.64 perf-profile.children.cycles-pp.zap_pmd_range
1.24 -0.1 1.17 -0.1 1.18 -0.0 1.24 perf-profile.children.cycles-pp.mas_prev_slot
1.58 -0.1 1.51 -0.1 1.52 +0.0 1.59 perf-profile.children.cycles-pp.mas_update_gap
0.62 -0.1 0.56 -0.0 0.58 -0.0 0.62 perf-profile.children.cycles-pp.security_mmap_addr
0.49 ± 2% -0.1 0.43 -0.0 0.44 ± 2% -0.0 0.46 ± 3% perf-profile.children.cycles-pp.setup_object
0.90 -0.1 0.84 -0.1 0.75 -0.1 0.78 perf-profile.children.cycles-pp.percpu_counter_add_batch
0.98 -0.1 0.92 -0.0 0.97 +0.0 1.02 perf-profile.children.cycles-pp.mas_pop_node
0.85 -0.1 0.80 -0.1 0.78 -0.0 0.84 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
1.68 -0.1 1.62 -0.1 1.62 +0.0 1.68 perf-profile.children.cycles-pp.__get_unmapped_area
1.23 -0.1 1.18 +0.0 1.27 +0.1 1.34 perf-profile.children.cycles-pp.__pte_offset_map_lock
1.08 -0.1 1.03 -0.0 1.08 +0.1 1.14 perf-profile.children.cycles-pp.zap_pte_range
0.69 ± 2% -0.0 0.64 -0.0 0.67 ± 2% +0.0 0.70 perf-profile.children.cycles-pp.syscall_return_via_sysret
1.04 -0.0 1.00 -0.0 1.00 +0.0 1.08 perf-profile.children.cycles-pp.vma_to_resize
1.08 -0.0 1.04 -0.0 1.04 +0.0 1.10 perf-profile.children.cycles-pp.mas_leaf_max_gap
0.51 ± 3% -0.0 0.47 -0.0 0.47 -0.0 0.51 perf-profile.children.cycles-pp.anon_vma_interval_tree_insert
1.18 -0.0 1.14 -0.1 1.12 +0.0 1.18 perf-profile.children.cycles-pp.clear_bhb_loop
0.57 -0.0 0.53 -0.0 0.52 ± 2% -0.0 0.54 perf-profile.children.cycles-pp.mas_wr_end_piv
0.43 -0.0 0.40 -0.1 0.38 -0.0 0.41 ± 3% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
1.14 -0.0 1.10 -0.0 1.09 +0.0 1.15 perf-profile.children.cycles-pp.mt_find
0.62 -0.0 0.58 -0.0 0.58 -0.0 0.61 perf-profile.children.cycles-pp.__put_partials
0.46 ± 7% -0.0 0.42 ± 2% -0.0 0.43 -0.0 0.45 perf-profile.children.cycles-pp._raw_spin_lock
0.90 -0.0 0.87 -0.0 0.88 +0.0 0.90 perf-profile.children.cycles-pp.userfaultfd_unmap_complete
0.46 ± 3% -0.0 0.42 ± 3% -0.0 0.42 ± 2% -0.0 0.45 ± 2% perf-profile.children.cycles-pp.__alloc_pages_noprof
0.61 -0.0 0.58 -0.0 0.58 -0.0 0.60 perf-profile.children.cycles-pp.entry_SYSCALL_64
0.44 ± 3% -0.0 0.40 ± 3% -0.0 0.40 ± 2% -0.0 0.43 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
0.48 -0.0 0.45 ± 2% -0.0 0.45 -0.0 0.46 perf-profile.children.cycles-pp.mas_prev_range
0.64 -0.0 0.61 -0.0 0.61 +0.0 0.65 perf-profile.children.cycles-pp.get_old_pud
0.31 ± 2% -0.0 0.28 ± 3% -0.0 0.29 ± 2% +0.0 0.32 ± 3% perf-profile.children.cycles-pp.security_vm_enough_memory_mm
0.33 ± 3% -0.0 0.30 ± 2% -0.0 0.30 ± 2% -0.0 0.32 ± 2% perf-profile.children.cycles-pp.mas_put_in_tree
0.32 ± 2% -0.0 0.29 ± 2% -0.0 0.30 ± 3% -0.0 0.31 ± 2% perf-profile.children.cycles-pp.tlb_finish_mmu
0.47 -0.0 0.44 ± 2% -0.0 0.42 ± 2% -0.0 0.45 perf-profile.children.cycles-pp.rcu_segcblist_enqueue
0.70 ± 3% -0.0 0.68 -0.0 0.66 ± 3% -0.1 0.60 perf-profile.children.cycles-pp.__anon_vma_interval_tree_remove
0.32 ± 3% -0.0 0.30 ± 2% -0.0 0.30 -0.0 0.32 perf-profile.children.cycles-pp.free_unref_page
0.55 -0.0 0.53 -0.0 0.55 ± 2% +0.0 0.58 perf-profile.children.cycles-pp.refill_obj_stock
0.33 -0.0 0.31 -0.0 0.32 +0.0 0.33 perf-profile.children.cycles-pp.mas_destroy
0.25 ± 4% -0.0 0.23 ± 3% -0.0 0.23 ± 3% -0.0 0.25 ± 2% perf-profile.children.cycles-pp.rmqueue
0.35 -0.0 0.34 -0.0 0.34 +0.0 0.36 perf-profile.children.cycles-pp.__rb_insert_augmented
0.39 -0.0 0.37 -0.0 0.36 ± 2% -0.0 0.38 perf-profile.children.cycles-pp.down_write_killable
0.22 ± 4% -0.0 0.20 ± 3% -0.0 0.20 ± 3% -0.0 0.22 ± 3% perf-profile.children.cycles-pp.__rmqueue_pcplist
0.21 ± 4% -0.0 0.19 ± 3% -0.0 0.19 ± 3% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.rmqueue_bulk
0.52 -0.0 0.51 ± 2% +0.1 0.59 +0.1 0.64 perf-profile.children.cycles-pp.__pte_offset_map
0.30 ± 2% -0.0 0.28 ± 2% -0.1 0.23 ± 3% -0.0 0.25 ± 3% perf-profile.children.cycles-pp.__vm_enough_memory
0.26 -0.0 0.24 ± 2% -0.0 0.21 -0.0 0.22 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.28 ± 2% -0.0 0.27 ± 2% -0.0 0.26 -0.0 0.28 ± 2% perf-profile.children.cycles-pp.free_unref_page_commit
0.29 -0.0 0.27 -0.0 0.27 ± 2% +0.0 0.29 ± 2% perf-profile.children.cycles-pp.tlb_gather_mmu
0.16 ± 2% -0.0 0.14 ± 3% -0.0 0.14 ± 2% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.mas_wr_append
0.28 ± 2% -0.0 0.26 +0.0 0.32 +0.1 0.33 ± 2% perf-profile.children.cycles-pp.khugepaged_enter_vma
0.32 -0.0 0.30 -0.0 0.30 -0.0 0.32 ± 2% perf-profile.children.cycles-pp.mas_wr_store_setup
0.09 ± 4% -0.0 0.08 ± 5% -0.0 0.06 ± 6% -0.0 0.07 perf-profile.children.cycles-pp.vma_dup_policy
0.43 -0.0 0.42 -0.0 0.41 +0.0 0.43 perf-profile.children.cycles-pp.mremap_userfaultfd_complete
0.13 ± 6% -0.0 0.12 ± 11% -0.0 0.10 ± 4% +0.0 0.13 ± 9% perf-profile.children.cycles-pp.vm_stat_account
0.36 -0.0 0.35 -0.0 0.35 +0.0 0.37 perf-profile.children.cycles-pp.madvise_vma_behavior
0.18 ± 2% -0.0 0.17 ± 2% -0.0 0.16 ± 2% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.__free_one_page
0.16 ± 3% -0.0 0.15 ± 3% -0.0 0.12 -0.0 0.13 ± 3% perf-profile.children.cycles-pp.x64_sys_call
0.15 ± 3% -0.0 0.14 ± 3% -0.0 0.13 ± 2% -0.0 0.14 ± 2% perf-profile.children.cycles-pp.flush_tlb_batched_pending
0.15 ± 2% -0.0 0.14 ± 3% +0.0 0.19 ± 2% +0.1 0.20 ± 2% perf-profile.children.cycles-pp.mas_node_count_gfp
0.24 ± 2% +0.0 0.24 ± 3% +0.0 0.24 ± 2% +0.0 0.27 ± 6% perf-profile.children.cycles-pp.lru_add_drain
0.07 +0.0 0.07 ± 6% -0.0 0.05 -0.0 0.05 ± 9% perf-profile.children.cycles-pp.__x64_sys_mremap
0.14 ± 3% +0.0 0.15 ± 2% +0.0 0.14 ± 5% +0.0 0.14 ± 2% perf-profile.children.cycles-pp.free_pgd_range
0.08 ± 4% +0.0 0.10 ± 4% +0.0 0.08 +0.0 0.08 perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags
0.78 +0.1 0.85 +0.1 0.84 +0.0 0.79 perf-profile.children.cycles-pp.__madvise
0.63 +0.1 0.71 +0.1 0.70 +0.0 0.64 perf-profile.children.cycles-pp.__x64_sys_madvise
0.63 +0.1 0.70 +0.1 0.70 +0.0 0.64 perf-profile.children.cycles-pp.do_madvise
3.52 +0.1 3.60 +0.4 3.97 +0.5 4.03 perf-profile.children.cycles-pp.free_pgtables
0.00 +0.1 0.09 +0.1 0.09 ± 3% +0.0 0.00 perf-profile.children.cycles-pp.can_modify_mm_madv
1.30 +0.2 1.46 +0.2 1.48 +0.0 1.32 perf-profile.children.cycles-pp.mas_next_slot
88.06 +0.8 88.84 +0.9 88.91 +0.3 88.40 perf-profile.children.cycles-pp.mremap
83.81 +1.0 84.84 +1.2 84.99 +0.5 84.28 perf-profile.children.cycles-pp.__do_sys_mremap
85.98 +1.0 87.02 +1.1 87.07 +0.4 86.38 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
85.50 +1.1 86.56 +1.1 86.60 +0.4 85.89 perf-profile.children.cycles-pp.do_syscall_64
2.12 +1.5 3.62 +1.5 3.61 +0.0 2.13 perf-profile.children.cycles-pp.do_munmap
40.41 +1.5 41.93 +1.6 42.04 +0.3 40.75 perf-profile.children.cycles-pp.do_vmi_munmap
3.62 +2.4 5.98 +2.3 5.93 +0.0 3.65 perf-profile.children.cycles-pp.mas_walk
5.40 +3.0 8.44 +3.0 8.41 +0.1 5.47 perf-profile.children.cycles-pp.mremap_to
5.26 +3.2 8.48 +3.2 8.44 +0.1 5.31 perf-profile.children.cycles-pp.mas_find
0.00 +5.5 5.46 +5.4 5.42 +0.0 0.00 perf-profile.children.cycles-pp.can_modify_mm
11.49 -0.6 10.92 -0.6 10.92 -0.1 11.41 perf-profile.self.cycles-pp.__slab_free
4.32 -0.2 4.07 -1.1 3.26 ± 2% -0.9 3.46 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
1.96 -0.2 1.80 ± 4% -0.2 1.75 ± 6% -0.1 1.89 ± 3% perf-profile.self.cycles-pp.__memcpy
2.36 ± 2% -0.1 2.24 ± 2% -0.1 2.22 ± 3% +0.0 2.38 ± 2% perf-profile.self.cycles-pp.down_write
2.42 -0.1 2.30 -0.1 2.31 +0.0 2.44 perf-profile.self.cycles-pp.rcu_cblist_dequeue
2.33 -0.1 2.22 -0.1 2.21 -0.0 2.32 perf-profile.self.cycles-pp.mtree_load
2.21 -0.1 2.10 -0.1 2.10 +0.0 2.22 perf-profile.self.cycles-pp.native_flush_tlb_one_user
2.04 ± 5% -0.1 1.95 ± 3% -0.2 1.80 ± 3% -0.1 1.90 ± 3% perf-profile.self.cycles-pp.obj_cgroup_charge
1.62 -0.1 1.54 -0.1 1.55 +0.0 1.63 ± 2% perf-profile.self.cycles-pp.__memcg_slab_free_hook
1.52 -0.1 1.44 -0.1 1.45 -0.0 1.50 perf-profile.self.cycles-pp.mas_wr_walk
1.15 ± 2% -0.1 1.07 -0.1 1.08 -0.0 1.14 ± 2% perf-profile.self.cycles-pp.shuffle_freelist
1.53 -0.1 1.45 -0.1 1.46 +0.0 1.53 perf-profile.self.cycles-pp.up_write
1.44 -0.1 1.36 -0.1 1.33 -0.0 1.41 perf-profile.self.cycles-pp.__call_rcu_common
0.70 ± 2% -0.1 0.62 -0.1 0.64 ± 3% -0.0 0.67 ± 2% perf-profile.self.cycles-pp.rcu_all_qs
1.72 -0.1 1.66 +1.0 2.68 ± 2% +1.1 2.84 perf-profile.self.cycles-pp.mod_objcg_state
0.51 ± 3% -0.1 0.45 -0.0 0.47 -0.0 0.50 perf-profile.self.cycles-pp.security_mmap_addr
2.52 -0.1 2.46 -0.2 2.36 -0.2 2.33 perf-profile.self.cycles-pp.kmem_cache_alloc_noprof
0.94 ± 2% -0.1 0.88 ± 4% -0.1 0.88 ± 3% -0.0 0.92 ± 5% perf-profile.self.cycles-pp.vm_area_dup
1.18 -0.1 1.12 -0.1 1.12 -0.0 1.18 perf-profile.self.cycles-pp.vma_merge
0.89 -0.1 0.83 -0.1 0.83 -0.0 0.88 perf-profile.self.cycles-pp.___slab_alloc
1.38 -0.1 1.33 -0.0 1.34 +0.0 1.39 perf-profile.self.cycles-pp.do_vmi_align_munmap
0.62 -0.1 0.56 ± 2% -0.1 0.56 -0.0 0.59 perf-profile.self.cycles-pp.mremap
1.00 -0.1 0.95 -0.1 0.94 -0.0 0.97 perf-profile.self.cycles-pp.mas_preallocate
0.98 -0.1 0.93 -0.0 0.94 -0.0 0.98 perf-profile.self.cycles-pp.move_ptes
0.99 -0.1 0.94 -0.0 0.94 -0.0 0.99 perf-profile.self.cycles-pp.mas_prev_slot
1.09 -0.0 1.04 ± 2% -0.0 1.07 +0.0 1.14 perf-profile.self.cycles-pp.__cond_resched
0.94 -0.0 0.90 -0.1 0.88 -0.0 0.94 perf-profile.self.cycles-pp.vm_area_free_rcu_cb
0.85 -0.0 0.80 -0.0 0.84 +0.0 0.88 perf-profile.self.cycles-pp.mas_pop_node
0.77 -0.0 0.72 -0.1 0.64 -0.1 0.66 perf-profile.self.cycles-pp.percpu_counter_add_batch
0.68 -0.0 0.63 -0.1 0.62 -0.0 0.66 perf-profile.self.cycles-pp.__split_vma
1.17 -0.0 1.13 -0.1 1.11 +0.0 1.17 perf-profile.self.cycles-pp.clear_bhb_loop
0.95 -0.0 0.91 -0.0 0.91 +0.0 0.95 perf-profile.self.cycles-pp.mas_leaf_max_gap
0.79 -0.0 0.75 -0.0 0.77 +0.0 0.80 perf-profile.self.cycles-pp.mas_wr_store_entry
0.44 -0.0 0.40 -0.0 0.41 +0.0 0.44 perf-profile.self.cycles-pp.do_munmap
1.22 -0.0 1.18 -0.0 1.19 +0.0 1.22 perf-profile.self.cycles-pp.move_vma
0.45 -0.0 0.42 -0.0 0.41 -0.0 0.43 perf-profile.self.cycles-pp.mas_wr_end_piv
0.89 -0.0 0.86 -0.0 0.87 +0.0 0.90 perf-profile.self.cycles-pp.mas_store_gfp
0.43 ± 2% -0.0 0.40 -0.1 0.38 -0.0 0.41 ± 3% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.78 -0.0 0.75 -0.0 0.76 +0.0 0.79 perf-profile.self.cycles-pp.userfaultfd_unmap_complete
0.66 -0.0 0.63 -0.0 0.63 -0.0 0.66 perf-profile.self.cycles-pp.mas_store_prealloc
1.49 -0.0 1.46 -0.0 1.45 ± 2% +0.0 1.50 perf-profile.self.cycles-pp.kmem_cache_free
0.60 -0.0 0.58 -0.0 0.58 +0.0 0.61 perf-profile.self.cycles-pp.unmap_region
0.86 -0.0 0.83 -0.0 0.84 +0.0 0.88 perf-profile.self.cycles-pp.move_page_tables
0.43 ± 4% -0.0 0.40 -0.0 0.40 -0.0 0.42 perf-profile.self.cycles-pp.anon_vma_interval_tree_insert
0.99 -0.0 0.97 -0.0 0.95 +0.0 1.00 perf-profile.self.cycles-pp.mt_find
0.71 -0.0 0.68 -0.0 0.67 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range
0.36 ± 3% -0.0 0.33 ± 2% -0.0 0.34 ± 3% +0.0 0.36 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.55 -0.0 0.52 -0.0 0.52 +0.0 0.55 perf-profile.self.cycles-pp.get_old_pud
0.49 -0.0 0.47 -0.0 0.47 +0.0 0.49 perf-profile.self.cycles-pp.find_vma_prev
0.27 -0.0 0.25 -0.0 0.25 -0.0 0.26 ± 2% perf-profile.self.cycles-pp.mas_prev_setup
0.41 -0.0 0.39 -0.0 0.39 +0.0 0.42 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.61 -0.0 0.58 -0.0 0.59 +0.0 0.62 perf-profile.self.cycles-pp.copy_vma
0.37 ± 6% -0.0 0.35 ± 2% -0.0 0.36 -0.0 0.37 perf-profile.self.cycles-pp._raw_spin_lock
0.47 -0.0 0.45 ± 2% -0.0 0.46 -0.0 0.47 perf-profile.self.cycles-pp.flush_tlb_mm_range
0.42 ± 2% -0.0 0.40 ± 2% -0.0 0.38 ± 2% -0.0 0.41 perf-profile.self.cycles-pp.rcu_segcblist_enqueue
0.27 -0.0 0.25 ± 2% -0.0 0.24 ± 2% -0.0 0.26 ± 2% perf-profile.self.cycles-pp.mas_put_in_tree
0.44 -0.0 0.42 -0.0 0.42 +0.0 0.44 perf-profile.self.cycles-pp.mas_update_gap
0.39 -0.0 0.37 -0.0 0.38 -0.0 0.39 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.49 -0.0 0.47 +0.0 0.50 ± 2% +0.0 0.52 perf-profile.self.cycles-pp.refill_obj_stock
0.27 ± 2% -0.0 0.25 ± 2% -0.0 0.26 -0.0 0.27 perf-profile.self.cycles-pp.tlb_finish_mmu
0.34 -0.0 0.32 -0.0 0.32 -0.0 0.33 perf-profile.self.cycles-pp.zap_pmd_range
0.48 -0.0 0.46 -0.0 0.48 +0.0 0.49 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.58 ± 2% -0.0 0.56 -0.0 0.54 ± 3% -0.1 0.48 perf-profile.self.cycles-pp.__anon_vma_interval_tree_remove
0.28 -0.0 0.26 -0.0 0.27 +0.0 0.28 ± 2% perf-profile.self.cycles-pp.mas_alloc_nodes
0.24 ± 2% -0.0 0.22 -0.0 0.22 +0.0 0.24 ± 2% perf-profile.self.cycles-pp.mas_prev
0.14 ± 3% -0.0 0.12 ± 2% -0.0 0.12 -0.0 0.12 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.52 -0.0 0.51 -0.0 0.51 +0.0 0.55 perf-profile.self.cycles-pp.mremap_to
0.26 -0.0 0.24 -0.0 0.24 -0.0 0.26 perf-profile.self.cycles-pp.__rb_insert_augmented
0.40 -0.0 0.39 -0.0 0.39 +0.0 0.41 ± 2% perf-profile.self.cycles-pp.__pte_offset_map_lock
0.38 -0.0 0.37 -0.0 0.36 -0.0 0.38 perf-profile.self.cycles-pp.mremap_userfaultfd_complete
0.28 -0.0 0.26 ± 3% -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.mas_prev_range
0.33 ± 2% -0.0 0.32 -0.0 0.31 -0.0 0.33 ± 2% perf-profile.self.cycles-pp.zap_pte_range
0.28 -0.0 0.26 -0.0 0.27 +0.0 0.28 perf-profile.self.cycles-pp.flush_tlb_func
0.22 -0.0 0.21 ± 2% -0.0 0.20 ± 2% -0.0 0.21 perf-profile.self.cycles-pp.entry_SYSCALL_64
0.10 -0.0 0.09 -0.0 0.09 ± 3% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.mod_node_page_state
0.17 -0.0 0.16 -0.0 0.17 ± 2% +0.0 0.17 perf-profile.self.cycles-pp.__thp_vma_allowable_orders
0.44 -0.0 0.42 ± 2% +0.1 0.50 +0.1 0.54 perf-profile.self.cycles-pp.__pte_offset_map
0.06 -0.0 0.05 -0.1 0.00 -0.0 0.02 ±129% perf-profile.self.cycles-pp.vma_dup_policy
0.13 ± 3% -0.0 0.12 ± 3% -0.0 0.09 -0.0 0.09 ± 5% perf-profile.self.cycles-pp.x64_sys_call
0.31 -0.0 0.30 -0.0 0.29 -0.0 0.29 perf-profile.self.cycles-pp.unmap_vmas
0.10 ± 10% -0.0 0.09 ± 12% -0.0 0.08 ± 5% +0.0 0.10 ± 12% perf-profile.self.cycles-pp.vm_stat_account
0.08 ± 5% -0.0 0.07 ± 4% +0.0 0.11 ± 3% +0.0 0.12 ± 3% perf-profile.self.cycles-pp.mas_node_count_gfp
0.22 -0.0 0.21 ± 2% -0.0 0.20 -0.0 0.21 ± 2% perf-profile.self.cycles-pp.do_syscall_64
0.11 -0.0 0.10 ± 4% -0.0 0.10 +0.0 0.11 perf-profile.self.cycles-pp.security_vm_enough_memory_mm
0.08 -0.0 0.08 ± 5% -0.0 0.08 ± 4% +0.0 0.09 perf-profile.self.cycles-pp.__vm_enough_memory
0.07 +0.0 0.07 +0.0 0.08 +0.0 0.09 ± 3% perf-profile.self.cycles-pp.khugepaged_enter_vma
0.15 ± 3% +0.0 0.16 ± 3% +0.0 0.16 ± 3% +0.0 0.17 ± 2% perf-profile.self.cycles-pp.vma_to_resize
0.56 +0.0 0.57 -0.0 0.53 -0.0 0.53 perf-profile.self.cycles-pp.__do_sys_mremap
0.06 ± 5% +0.0 0.07 +0.0 0.06 +0.0 0.06 perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags
0.11 ± 4% +0.0 0.12 ± 4% -0.0 0.11 ± 3% +0.0 0.12 ± 3% perf-profile.self.cycles-pp.free_pgd_range
0.21 +0.0 0.22 ± 2% -0.0 0.21 ± 2% +0.0 0.22 ± 2% perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags
0.45 +0.0 0.48 +0.0 0.48 -0.0 0.44 perf-profile.self.cycles-pp.do_vmi_munmap
0.27 +0.0 0.32 +0.3 0.60 +0.4 0.62 perf-profile.self.cycles-pp.free_pgtables
0.36 ± 2% +0.1 0.44 +0.0 0.37 ± 2% -0.0 0.35 perf-profile.self.cycles-pp.unlink_anon_vmas
1.06 +0.1 1.19 +0.1 1.20 +0.0 1.08 perf-profile.self.cycles-pp.mas_next_slot
1.49 +0.5 2.01 +0.5 1.98 +0.0 1.50 perf-profile.self.cycles-pp.mas_find
0.00 +1.4 1.38 +1.4 1.38 +0.0 0.00 perf-profile.self.cycles-pp.can_modify_mm
3.15 +2.1 5.23 +2.0 5.19 +0.0 3.16 perf-profile.self.cycles-pp.mas_walk
>
> For everyone: Apologies if you're in the CC list and I didn't CC you,
> but I tried to keep my patch set's CC list relatively short and clean
> (and I focused on the active participants).
> Everyone's comments are very welcome.
>
> [1]: https://lore.kernel.org/all/20240806212808.1885309-1-pedro.falcato@gmail.com/
> --
> Pedro
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 14:43 ` Linus Torvalds
@ 2024-08-07 12:26 ` Michael Ellerman
0 siblings, 0 replies; 29+ messages in thread
From: Michael Ellerman @ 2024-08-07 12:26 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jeff Xu, Nicholas Piggin, Christophe Leroy, Pedro Falcato,
kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel,
Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet,
Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
feng.tang, fengwei.yin
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Tue, 6 Aug 2024 at 05:03, Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Or should I turn it into a series and post it?
>
> I think post it as a single working patch rather than as a series that
> breaks things and then fixes it.
It splits nicely with no breakage along the way.
> And considering that you did all the testing and found the problems,
> just take ownership of it and make it a "Suggested-by: Linus" or
> something.
Sure.
cheers
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
2024-08-06 2:01 ` Michael Ellerman
2024-08-06 2:15 ` Linus Torvalds
@ 2024-09-13 5:47 ` Christophe Leroy
1 sibling, 0 replies; 29+ messages in thread
From: Christophe Leroy @ 2024-09-13 5:47 UTC (permalink / raw)
To: Michael Ellerman, Linus Torvalds, Nicholas Piggin
Cc: Jeff Xu, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp,
linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
linux-mm, ying.huang, feng.tang, fengwei.yin
On 06/08/2024 at 04:01, Michael Ellerman wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>> On Mon, 5 Aug 2024 at 16:25, Nicholas Piggin <npiggin@gmail.com> wrote:
>>>
>>> Can userspace on other archs not unmap their vdsos?
>>
>> I think they can, and nobody cares. The "context.vdso" value stays at
>> some stale value, and anybody who tries to use it will just fail.
>>
>> So what makes powerpc special is not "you can unmap the vdso", but
>> "powerpc cares".
>>
>> I just don't quite know _why_ powerpc cares.
>
> AFAIK for CRIU the problem is signal delivery:
>
> arch/powerpc/kernel/signal_64.c:
>
> int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
> struct task_struct *tsk)
> {
> ...
> /* Set up to return from userspace. */
> if (tsk->mm->context.vdso) {
> regs_set_return_ip(regs, VDSO64_SYMBOL(tsk->mm->context.vdso, sigtramp_rt64));
>
>
> ie. if the VDSO is moved but mm->context.vdso is not updated, signal
> delivery will crash in userspace.
>
> x86-64 always uses SA_RESTORER, and arm64 & s390 can use SA_RESTORER, so
> I think CRIU uses that to avoid problems with signal delivery when the
> VDSO is moved.
>
> riscv doesn't support SA_RESTORER but I guess CRIU doesn't support riscv
> yet so it's not become a problem.
>
> There was a patch to support SA_RESTORER on powerpc, but I balked at
> merging it because I couldn't find anyone on the glibc side to say
> whether they wanted it or not. I guess I should have just merged it.
The patch is at
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/afe50d1db63a10fde9547ea08fe1fa68b0638aba.1624618157.git.christophe.leroy@csgroup.eu/
It still applies cleanly.
Christophe
>
> There was an attempt to unify all the vdso stuff and handle the
> VDSO mremap case in generic code:
>
> https://lore.kernel.org/lkml/20210611180242.711399-17-dima@arista.com/
>
> But I think that series got a bit big and complicated and Dmitry had to
> move on to other things.
>
> cheers
end of thread, other threads:[~2024-09-13 5:47 UTC | newest]
Thread overview: 29+ messages
2024-08-04 8:59 [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression kernel test robot
2024-08-04 20:32 ` Linus Torvalds
2024-08-05 13:33 ` Pedro Falcato
2024-08-05 18:10 ` Jeff Xu
2024-08-05 18:55 ` Linus Torvalds
2024-08-05 19:33 ` Linus Torvalds
2024-08-06 2:14 ` Michael Ellerman
2024-08-06 2:17 ` Linus Torvalds
2024-08-06 12:03 ` Michael Ellerman
2024-08-06 14:43 ` Linus Torvalds
2024-08-07 12:26 ` Michael Ellerman
2024-08-06 6:04 ` Oliver Sang
2024-08-06 14:38 ` Linus Torvalds
2024-08-06 21:37 ` Pedro Falcato
2024-08-07 5:54 ` Oliver Sang
2024-08-05 19:37 ` Jeff Xu
2024-08-05 19:48 ` Linus Torvalds
2024-08-05 19:50 ` Linus Torvalds
2024-08-05 23:24 ` Nicholas Piggin
2024-08-06 0:13 ` Linus Torvalds
2024-08-06 1:22 ` Jeff Xu
2024-08-06 2:01 ` Michael Ellerman
2024-08-06 2:15 ` Linus Torvalds
2024-09-13 5:47 ` Christophe Leroy
2024-08-05 17:54 ` Jeff Xu
2024-08-05 13:56 ` Jeff Xu
2024-08-05 16:58 ` Jeff Xu
2024-08-06 1:44 ` Oliver Sang
2024-08-06 14:54 ` Jeff Xu