* [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
@ 2024-08-04  8:59 kernel test robot
  2024-08-04 20:32 ` Linus Torvalds
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: kernel test robot @ 2024-08-04  8:59 UTC (permalink / raw)
  To: Jeff Xu
  Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
      Pedro Falcato, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck,
      Jann Horn, Jeff Xu, Jonathan Corbet, Jorge Lucangeli Obes,
      Linus Torvalds, Matthew Wilcox, Muhammad Usama Anjum,
      Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
      Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
      feng.tang, fengwei.yin, oliver.sang

Hello,

kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on:

commit: 8be7258aad44b5e25977a98db136f677fa6f4370 ("mseal: add mseal syscall")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: pagemove
	cpufreq_governor: performance

In addition to that, the commit also has significant impact on the following tests:

+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.pkey.ops_per_sec -3.6% regression                                      |
| test machine     | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory |
| test parameters  | cpufreq_governor=performance                                                                 |
|                  | nr_threads=100%                                                                              |
|                  | test=pkey                                                                                    |
|                  | testtime=60s                                                                                 |
+------------------+---------------------------------------------------------------------------------------------+

If you fix the issue in a separate patch/commit (i.e.
not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <oliver.sang@intel.com> | Closes: https://lore.kernel.org/oe-lkp/202408041602.caa0372-oliver.sang@intel.com Details are as below: --------------------------------------------------------------------------------------------------> The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20240804/202408041602.caa0372-oliver.sang@intel.com ========================================================================================= compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s commit: ff388fe5c4 ("mseal: wire up mseal syscall") 8be7258aad ("mseal: add mseal syscall") ff388fe5c481d39c 8be7258aad44b5e25977a98db13 ---------------- --------------------------- %stddev %change %stddev \ | \ 41625945 -4.3% 39842322 proc-vmstat.numa_hit 41559175 -4.3% 39774160 proc-vmstat.numa_local 77484314 -4.4% 74105555 proc-vmstat.pgalloc_normal 77205752 -4.4% 73826672 proc-vmstat.pgfree 18361466 -4.2% 17596652 stress-ng.pagemove.ops 306014 -4.2% 293262 stress-ng.pagemove.ops_per_sec 205312 -4.4% 196176 stress-ng.pagemove.page_remaps_per_sec 4961 +1.0% 5013 stress-ng.time.percent_of_cpu_this_job_got 2917 +1.2% 2952 stress-ng.time.system_time 1.07 -6.6% 1.00 perf-stat.i.MPKI 3.354e+10 +3.5% 3.473e+10 perf-stat.i.branch-instructions 1.795e+08 -4.2% 1.719e+08 perf-stat.i.cache-misses 2.376e+08 -4.1% 2.279e+08 perf-stat.i.cache-references 1.13 -3.0% 1.10 perf-stat.i.cpi 1077 +4.3% 1124 perf-stat.i.cycles-between-cache-misses 1.717e+11 +2.7% 1.762e+11 perf-stat.i.instructions 0.88 +3.1% 0.91 perf-stat.i.ipc 1.05 -6.8% 0.97 perf-stat.overall.MPKI 0.25 ± 2% -0.0 0.24 perf-stat.overall.branch-miss-rate% 1.13 -3.0% 1.10 perf-stat.overall.cpi 1084 +4.0% 1127 perf-stat.overall.cycles-between-cache-misses 0.88 +3.1% 0.91 perf-stat.overall.ipc 3.298e+10 +3.5% 3.415e+10 perf-stat.ps.branch-instructions 1.764e+08 -4.3% 1.689e+08 perf-stat.ps.cache-misses 2.336e+08 -4.1% 2.24e+08 perf-stat.ps.cache-references 194.57 -2.4% 189.96 ± 2% perf-stat.ps.cpu-migrations 1.688e+11 +2.7% 1.733e+11 perf-stat.ps.instructions 1.036e+13 +3.0% 1.068e+13 perf-stat.total.instructions 75.12 -1.9 73.22 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 36.84 -1.6 35.29 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 24.90 -1.2 23.72 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 19.89 -0.9 18.98 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 10.56 ± 2% -0.8 9.78 ± 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 10.57 ± 2% -0.8 9.80 ± 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 10.52 ± 2% -0.8 9.75 ± 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn 10.62 ± 2% -0.8 9.85 ± 2% 
perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm 14.75 -0.7 14.07 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 1.50 -0.6 0.94 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.88 ± 2% -0.4 5.47 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 7.80 -0.3 7.47 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 4.55 ± 2% -0.3 4.24 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs 6.76 -0.3 6.45 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.15 -0.3 5.86 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap 8.22 -0.3 7.93 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 6.12 -0.3 5.87 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 5.74 -0.2 5.50 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 3.16 ± 2% -0.2 2.94 perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 5.50 -0.2 5.28 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap 1.36 -0.2 1.14 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 5.15 -0.2 4.94 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap 5.51 -0.2 5.31 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap 5.16 -0.2 4.97 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma 2.24 -0.2 2.05 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 2.60 ± 2% -0.2 2.42 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs 4.67 -0.2 4.49 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma 3.41 -0.2 3.23 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 3.00 -0.2 2.83 ± 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.96 -0.2 0.80 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to 4.04 -0.2 3.88 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 3.20 ± 2% -0.2 3.04 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 3.53 -0.1 3.38 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap 3.40 -0.1 3.26 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap 2.20 ± 2% -0.1 2.06 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap 1.84 ± 3% -0.1 1.71 ± 3% 
perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap 1.78 ± 2% -0.1 1.65 ± 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap 2.69 -0.1 2.56 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap 1.78 ± 2% -0.1 1.66 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core 1.36 ± 2% -0.1 1.23 ± 2% perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 0.95 -0.1 0.83 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap 3.29 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 2.08 -0.1 1.96 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap 1.43 ± 3% -0.1 1.32 ± 3% perf-profile.calltrace.cycles-pp.down_write.vma_prepare.vma_merge.copy_vma.move_vma 2.21 -0.1 2.10 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 2.47 -0.1 2.36 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma 2.21 -0.1 2.12 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables 1.41 -0.1 1.32 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap 1.26 -0.1 1.18 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap 1.82 -0.1 1.75 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 0.71 -0.1 0.63 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap 1.29 -0.1 1.22 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.61 -0.1 0.54 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma 1.36 -0.1 1.29 perf-profile.calltrace.cycles-pp.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap 1.40 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma 0.70 -0.1 0.64 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 1.23 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma 1.66 -0.1 1.60 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.16 -0.1 1.10 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 0.96 -0.1 0.90 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region 1.14 -0.1 1.08 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.79 -0.1 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma 1.04 -0.1 1.00 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.58 -0.0 0.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap 0.61 -0.0 0.56 
perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core 0.56 -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 0.57 -0.0 0.53 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs 0.78 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge 0.88 -0.0 0.84 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 0.70 -0.0 0.66 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap 0.97 -0.0 0.93 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 1.11 -0.0 1.08 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap 0.75 -0.0 0.72 perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 0.74 -0.0 0.71 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap 0.60 ± 2% -0.0 0.57 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 0.67 ± 2% -0.0 0.64 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.63 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.99 -0.0 0.96 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.62 ± 2% -0.0 0.59 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 0.87 -0.0 0.84 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.78 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap 0.64 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap 0.90 -0.0 0.87 perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.54 -0.0 0.52 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.move_ptes.move_page_tables.move_vma.__do_sys_mremap 1.04 +0.0 1.08 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region 0.76 +0.1 0.83 perf-profile.calltrace.cycles-pp.__madvise 0.63 +0.1 0.70 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.62 +0.1 0.70 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 87.74 +0.7 88.45 perf-profile.calltrace.cycles-pp.mremap 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap 
84.88 +0.9 85.77 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap 84.73 +0.9 85.62 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.00 +0.9 0.92 ± 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma 83.84 +0.9 84.78 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.00 +1.1 1.06 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.2 1.21 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to 2.07 +1.5 3.55 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.58 +1.5 3.07 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap 0.00 +1.6 1.57 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.7 1.72 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 0.00 +2.0 2.01 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.39 +2.9 8.32 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 75.29 -1.9 73.37 perf-profile.children.cycles-pp.move_vma 37.06 -1.6 35.50 perf-profile.children.cycles-pp.do_vmi_align_munmap 24.98 -1.2 23.80 perf-profile.children.cycles-pp.copy_vma 19.99 -1.0 19.02 perf-profile.children.cycles-pp.handle_softirqs 19.97 -1.0 19.00 perf-profile.children.cycles-pp.rcu_core 19.95 -1.0 18.98 perf-profile.children.cycles-pp.rcu_do_batch 19.98 -0.9 19.06 perf-profile.children.cycles-pp.__split_vma 17.55 -0.8 16.76 perf-profile.children.cycles-pp.kmem_cache_free 10.56 ± 2% -0.8 9.79 ± 2% perf-profile.children.cycles-pp.run_ksoftirqd 10.57 ± 2% -0.8 9.80 ± 2% perf-profile.children.cycles-pp.smpboot_thread_fn 15.38 -0.8 14.62 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof 10.62 ± 2% -0.8 9.85 ± 2% perf-profile.children.cycles-pp.kthread 10.62 ± 2% -0.8 9.86 ± 2% perf-profile.children.cycles-pp.ret_from_fork 10.62 ± 2% -0.8 9.86 ± 2% perf-profile.children.cycles-pp.ret_from_fork_asm 15.14 -0.7 14.44 perf-profile.children.cycles-pp.vma_merge 12.08 -0.5 11.55 perf-profile.children.cycles-pp.__slab_free 12.11 -0.5 11.62 perf-profile.children.cycles-pp.mas_wr_store_entry 10.86 -0.5 10.39 perf-profile.children.cycles-pp.vm_area_dup 11.89 -0.5 11.44 perf-profile.children.cycles-pp.mas_store_prealloc 8.49 -0.4 8.06 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook 9.88 -0.4 9.49 perf-profile.children.cycles-pp.mas_wr_node_store 7.91 -0.3 7.58 perf-profile.children.cycles-pp.move_page_tables 6.06 -0.3 5.78 perf-profile.children.cycles-pp.vm_area_free_rcu_cb 8.28 -0.3 8.00 perf-profile.children.cycles-pp.unmap_region 6.69 -0.3 6.42 perf-profile.children.cycles-pp.vma_complete 5.06 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate 5.82 -0.2 5.57 perf-profile.children.cycles-pp.move_ptes 4.24 -0.2 4.01 perf-profile.children.cycles-pp.anon_vma_clone 3.50 -0.2 3.30 perf-profile.children.cycles-pp.down_write 2.44 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev 3.46 -0.2 3.28 perf-profile.children.cycles-pp.___slab_alloc 3.45 -0.2 3.27 perf-profile.children.cycles-pp.free_pgtables 2.54 -0.2 2.37 
perf-profile.children.cycles-pp.rcu_cblist_dequeue 3.35 -0.2 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook 2.93 -0.2 2.78 perf-profile.children.cycles-pp.mas_alloc_nodes 2.28 ± 2% -0.2 2.12 ± 2% perf-profile.children.cycles-pp.vma_prepare 3.46 -0.1 3.32 perf-profile.children.cycles-pp.flush_tlb_mm_range 3.41 -0.1 3.27 ± 2% perf-profile.children.cycles-pp.mod_objcg_state 2.76 -0.1 2.63 perf-profile.children.cycles-pp.unlink_anon_vmas 3.41 -0.1 3.28 perf-profile.children.cycles-pp.mas_store_gfp 2.21 -0.1 2.09 perf-profile.children.cycles-pp.__cond_resched 2.04 -0.1 1.94 perf-profile.children.cycles-pp.allocate_slab 2.10 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common 2.51 -0.1 2.40 perf-profile.children.cycles-pp.flush_tlb_func 1.04 -0.1 0.94 perf-profile.children.cycles-pp.mas_prev 2.71 -0.1 2.61 perf-profile.children.cycles-pp.mtree_load 2.23 -0.1 2.14 perf-profile.children.cycles-pp.native_flush_tlb_one_user 0.22 ± 5% -0.1 0.13 ± 13% perf-profile.children.cycles-pp.vm_stat_account 0.95 -0.1 0.87 perf-profile.children.cycles-pp.mas_prev_setup 1.65 -0.1 1.57 perf-profile.children.cycles-pp.mas_wr_walk 1.84 -0.1 1.76 perf-profile.children.cycles-pp.up_write 1.27 -0.1 1.20 perf-profile.children.cycles-pp.mas_prev_slot 1.84 -0.1 1.77 perf-profile.children.cycles-pp.vma_link 1.39 -0.1 1.32 perf-profile.children.cycles-pp.shuffle_freelist 0.96 -0.1 0.90 ± 2% perf-profile.children.cycles-pp.rcu_all_qs 0.86 -0.1 0.80 perf-profile.children.cycles-pp._raw_spin_lock_irqsave 1.70 -0.1 1.64 perf-profile.children.cycles-pp.__get_unmapped_area 0.34 ± 3% -0.1 0.29 ± 5% perf-profile.children.cycles-pp.security_vm_enough_memory_mm 0.60 -0.0 0.55 perf-profile.children.cycles-pp.entry_SYSCALL_64 0.92 -0.0 0.87 perf-profile.children.cycles-pp.percpu_counter_add_batch 1.07 -0.0 1.02 perf-profile.children.cycles-pp.vma_to_resize 1.59 -0.0 1.54 perf-profile.children.cycles-pp.mas_update_gap 0.44 ± 2% -0.0 0.40 ± 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 0.70 -0.0 0.66 perf-profile.children.cycles-pp.syscall_return_via_sysret 1.13 -0.0 1.09 perf-profile.children.cycles-pp.mt_find 0.20 ± 6% -0.0 0.17 ± 9% perf-profile.children.cycles-pp.cap_vm_enough_memory 0.99 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node 0.63 ± 2% -0.0 0.59 perf-profile.children.cycles-pp.security_mmap_addr 0.62 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials 1.17 -0.0 1.14 perf-profile.children.cycles-pp.clear_bhb_loop 0.46 -0.0 0.43 ± 2% perf-profile.children.cycles-pp.__alloc_pages_noprof 0.44 -0.0 0.41 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist 0.90 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete 0.64 ± 2% -0.0 0.62 perf-profile.children.cycles-pp.get_old_pud 1.07 -0.0 1.05 perf-profile.children.cycles-pp.mas_leaf_max_gap 0.22 ± 3% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.__rmqueue_pcplist 0.55 -0.0 0.53 perf-profile.children.cycles-pp.refill_obj_stock 0.25 -0.0 0.23 ± 3% perf-profile.children.cycles-pp.rmqueue 0.48 -0.0 0.45 perf-profile.children.cycles-pp.mremap_userfaultfd_prep 0.33 -0.0 0.30 perf-profile.children.cycles-pp.free_unref_page 0.46 -0.0 0.44 perf-profile.children.cycles-pp.setup_object 0.21 ± 3% -0.0 0.19 ± 2% perf-profile.children.cycles-pp.rmqueue_bulk 0.31 ± 3% -0.0 0.29 perf-profile.children.cycles-pp.__vm_enough_memory 0.40 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack 0.36 -0.0 0.35 perf-profile.children.cycles-pp.madvise_vma_behavior 0.54 -0.0 0.53 ± 2% 
perf-profile.children.cycles-pp.mas_wr_end_piv 0.46 -0.0 0.44 ± 2% perf-profile.children.cycles-pp.rcu_segcblist_enqueue 0.34 -0.0 0.32 ± 2% perf-profile.children.cycles-pp.mas_destroy 0.28 -0.0 0.26 ± 3% perf-profile.children.cycles-pp.mas_wr_store_setup 0.30 -0.0 0.28 perf-profile.children.cycles-pp.pte_offset_map_nolock 0.19 -0.0 0.18 ± 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders 0.08 ± 4% -0.0 0.07 perf-profile.children.cycles-pp.ksm_madvise 0.17 -0.0 0.16 perf-profile.children.cycles-pp.get_any_partial 0.08 -0.0 0.07 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare 0.45 +0.0 0.47 perf-profile.children.cycles-pp._raw_spin_lock 1.10 +0.0 1.14 perf-profile.children.cycles-pp.zap_pte_range 0.78 +0.1 0.85 perf-profile.children.cycles-pp.__madvise 0.63 +0.1 0.70 perf-profile.children.cycles-pp.__x64_sys_madvise 0.62 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise 0.00 +0.1 0.09 ± 4% perf-profile.children.cycles-pp.can_modify_mm_madv 1.32 +0.1 1.46 perf-profile.children.cycles-pp.mas_next_slot 88.13 +0.7 88.83 perf-profile.children.cycles-pp.mremap 83.94 +0.9 84.88 perf-profile.children.cycles-pp.__do_sys_mremap 86.06 +0.9 87.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe 85.56 +1.0 86.54 perf-profile.children.cycles-pp.do_syscall_64 40.49 +1.4 41.90 perf-profile.children.cycles-pp.do_vmi_munmap 2.10 +1.5 3.57 perf-profile.children.cycles-pp.do_munmap 3.62 +2.3 5.90 perf-profile.children.cycles-pp.mas_walk 5.44 +2.9 8.38 perf-profile.children.cycles-pp.mremap_to 5.30 +3.1 8.39 perf-profile.children.cycles-pp.mas_find 0.00 +5.4 5.40 perf-profile.children.cycles-pp.can_modify_mm 11.46 -0.5 10.96 perf-profile.self.cycles-pp.__slab_free 4.30 -0.2 4.08 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook 2.51 -0.2 2.34 perf-profile.self.cycles-pp.rcu_cblist_dequeue 2.41 ± 2% -0.2 2.25 perf-profile.self.cycles-pp.down_write 2.21 -0.1 2.11 perf-profile.self.cycles-pp.native_flush_tlb_one_user 2.37 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load 1.60 -0.1 1.51 perf-profile.self.cycles-pp.__memcg_slab_free_hook 0.18 ± 3% -0.1 0.10 ± 15% perf-profile.self.cycles-pp.vm_stat_account 1.25 -0.1 1.18 perf-profile.self.cycles-pp.move_vma 1.76 -0.1 1.69 perf-profile.self.cycles-pp.mod_objcg_state 1.42 -0.1 1.35 ± 2% perf-profile.self.cycles-pp.__call_rcu_common 1.41 -0.1 1.34 perf-profile.self.cycles-pp.mas_wr_walk 1.52 -0.1 1.46 perf-profile.self.cycles-pp.up_write 1.02 -0.1 0.95 perf-profile.self.cycles-pp.mas_prev_slot 0.96 -0.1 0.90 ± 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb 1.50 -0.1 1.45 perf-profile.self.cycles-pp.kmem_cache_free 0.69 ± 3% -0.1 0.64 ± 2% perf-profile.self.cycles-pp.rcu_all_qs 1.14 ± 2% -0.1 1.09 perf-profile.self.cycles-pp.shuffle_freelist 1.10 -0.1 1.05 perf-profile.self.cycles-pp.__cond_resched 1.40 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap 0.99 -0.0 0.94 perf-profile.self.cycles-pp.mas_preallocate 0.88 -0.0 0.83 perf-profile.self.cycles-pp.___slab_alloc 0.55 -0.0 0.50 perf-profile.self.cycles-pp.mremap_to 0.98 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes 0.78 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch 0.21 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64 0.44 ± 2% -0.0 0.40 ± 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 0.92 -0.0 0.89 perf-profile.self.cycles-pp.mas_store_gfp 0.86 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node 0.50 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 1.15 -0.0 1.12 
perf-profile.self.cycles-pp.clear_bhb_loop 1.14 -0.0 1.11 perf-profile.self.cycles-pp.vma_merge 0.66 -0.0 0.63 perf-profile.self.cycles-pp.__split_vma 0.16 ± 6% -0.0 0.13 ± 7% perf-profile.self.cycles-pp.cap_vm_enough_memory 0.82 -0.0 0.79 perf-profile.self.cycles-pp.mas_wr_store_entry 0.54 ± 2% -0.0 0.52 perf-profile.self.cycles-pp.get_old_pud 0.43 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap 0.51 ± 2% -0.0 0.48 ± 2% perf-profile.self.cycles-pp.security_mmap_addr 0.50 -0.0 0.48 perf-profile.self.cycles-pp.refill_obj_stock 0.24 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev 0.71 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range 0.48 -0.0 0.45 perf-profile.self.cycles-pp.find_vma_prev 0.42 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave 0.66 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc 0.31 -0.0 0.29 perf-profile.self.cycles-pp.mas_prev_setup 0.43 -0.0 0.41 perf-profile.self.cycles-pp.mas_wr_end_piv 0.78 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete 0.28 -0.0 0.26 ± 2% perf-profile.self.cycles-pp.mas_put_in_tree 0.42 -0.0 0.40 perf-profile.self.cycles-pp.mremap_userfaultfd_prep 0.28 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables 0.39 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack 0.30 ± 2% -0.0 0.28 perf-profile.self.cycles-pp.zap_pmd_range 0.32 -0.0 0.31 perf-profile.self.cycles-pp.unmap_vmas 0.21 -0.0 0.20 perf-profile.self.cycles-pp.__get_unmapped_area 0.18 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.lru_add_drain_cpu 0.06 -0.0 0.05 perf-profile.self.cycles-pp.ksm_madvise 0.45 +0.0 0.46 perf-profile.self.cycles-pp.do_vmi_munmap 0.37 +0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk *************************************************************************************************** lkp-spr-r02: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory ========================================================================================= compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pkey/stress-ng/60s commit: ff388fe5c4 ("mseal: wire up mseal syscall") 8be7258aad ("mseal: add mseal syscall") ff388fe5c481d39c 8be7258aad44b5e25977a98db13 ---------------- --------------------------- %stddev %change %stddev \ | \ 10539 -2.5% 10273 vmstat.system.cs 0.28 ± 5% -20.1% 0.22 ± 7% sched_debug.cfs_rq:/.h_nr_running.stddev 1419 ± 7% -15.3% 1202 ± 6% sched_debug.cfs_rq:/.util_avg.max 0.28 ± 6% -18.4% 0.23 ± 8% sched_debug.cpu.nr_running.stddev 8.736e+08 -3.6% 8.423e+08 stress-ng.pkey.ops 14560560 -3.6% 14038795 stress-ng.pkey.ops_per_sec 770.39 ± 4% -5.0% 732.04 stress-ng.time.user_time 244657 ± 3% +5.8% 258782 ± 3% proc-vmstat.nr_slab_unreclaimable 73133541 -2.1% 71588873 proc-vmstat.numa_hit 72873579 -2.1% 71357274 proc-vmstat.numa_local 1.842e+08 -2.5% 1.796e+08 proc-vmstat.pgalloc_normal 1.767e+08 -2.8% 1.717e+08 proc-vmstat.pgfree 1345346 ± 40% -73.1% 362064 ±124% numa-vmstat.node0.nr_inactive_anon 1345340 ± 40% -73.1% 362062 ±124% numa-vmstat.node0.nr_zone_inactive_anon 2420830 ± 14% +35.1% 3270248 ± 16% numa-vmstat.node1.nr_file_pages 2067871 ± 13% +51.5% 3132982 ± 17% numa-vmstat.node1.nr_inactive_anon 191406 ± 17% +33.6% 255808 ± 14% 
numa-vmstat.node1.nr_mapped 2452 ± 61% +104.4% 5012 ± 35% numa-vmstat.node1.nr_page_table_pages 2067853 ± 13% +51.5% 3132966 ± 17% numa-vmstat.node1.nr_zone_inactive_anon 5379238 ± 40% -73.0% 1453605 ±123% numa-meminfo.node0.Inactive 5379166 ± 40% -73.0% 1453462 ±123% numa-meminfo.node0.Inactive(anon) 8741077 ± 22% -36.7% 5531290 ± 28% numa-meminfo.node0.MemUsed 9651902 ± 13% +35.8% 13105318 ± 16% numa-meminfo.node1.FilePages 8239855 ± 13% +52.4% 12556929 ± 17% numa-meminfo.node1.Inactive 8239712 ± 13% +52.4% 12556853 ± 17% numa-meminfo.node1.Inactive(anon) 761944 ± 18% +34.6% 1025906 ± 14% numa-meminfo.node1.Mapped 11679628 ± 11% +31.2% 15322841 ± 14% numa-meminfo.node1.MemUsed 9874 ± 62% +104.6% 20200 ± 36% numa-meminfo.node1.PageTables 0.74 -4.2% 0.71 perf-stat.i.MPKI 1.245e+11 +2.3% 1.274e+11 perf-stat.i.branch-instructions 0.37 -0.0 0.35 perf-stat.i.branch-miss-rate% 4.359e+08 -2.1% 4.265e+08 perf-stat.i.branch-misses 4.672e+08 -2.6% 4.548e+08 perf-stat.i.cache-misses 7.276e+08 -2.7% 7.082e+08 perf-stat.i.cache-references 1.00 -1.6% 0.98 perf-stat.i.cpi 1364 +2.9% 1404 perf-stat.i.cycles-between-cache-misses 6.392e+11 +1.7% 6.499e+11 perf-stat.i.instructions 1.00 +1.6% 1.02 perf-stat.i.ipc 0.74 -4.3% 0.71 perf-stat.overall.MPKI 0.35 -0.0 0.33 perf-stat.overall.branch-miss-rate% 1.00 -1.6% 0.99 perf-stat.overall.cpi 1356 +2.9% 1395 perf-stat.overall.cycles-between-cache-misses 1.00 +1.6% 1.01 perf-stat.overall.ipc 1.209e+11 +1.9% 1.232e+11 perf-stat.ps.branch-instructions 4.188e+08 -2.6% 4.077e+08 perf-stat.ps.branch-misses 4.585e+08 -3.1% 4.441e+08 perf-stat.ps.cache-misses 7.124e+08 -3.1% 6.901e+08 perf-stat.ps.cache-references 10321 -2.6% 10053 perf-stat.ps.context-switches Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 29+ messages in thread
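
[ For orientation when reading the profile data above: the new can_modify_mm and
  mas_walk cycles come from the up-front sealing check that commit 8be7258aad44
  adds ahead of the munmap/mremap paths. As a rough, illustrative sketch -- not
  the verbatim mm/mseal.c source -- the check amounts to one extra maple-tree
  walk over the whole range before any real work is done:

	/*
	 * Sketch only: assumes the usual <linux/mm.h> definitions.
	 * VM_SEALED is the vma flag added by the mseal series.
	 */
	static bool can_modify_mm_sketch(struct mm_struct *mm,
					 unsigned long start, unsigned long end)
	{
		struct vm_area_struct *vma;
		VMA_ITERATOR(vmi, mm, start);

		/* Walk every VMA overlapping [start, end) ... */
		for_each_vma_range(vmi, vma, end) {
			/* ... and refuse the whole operation if any is sealed. */
			if (vma->vm_flags & VM_SEALED)
				return false;
		}
		return true;
	}

  The munmap/mremap code then walks the same range again to do the actual
  split/detach work, which is what the discussion below turns on. ]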
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
  2024-08-04  8:59 [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression kernel test robot
@ 2024-08-04 20:32 ` Linus Torvalds
  2024-08-05 13:33   ` Pedro Falcato
  2024-08-05 17:54   ` Jeff Xu
  2024-08-05 13:56 ` Jeff Xu
  2024-08-05 16:58 ` Jeff Xu
  2 siblings, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2024-08-04 20:32 UTC (permalink / raw)
  To: kernel test robot
  Cc: Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook,
      Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman,
      Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet,
      Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
      Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
      Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
      feng.tang, fengwei.yin

On Sun, 4 Aug 2024 at 01:59, kernel test robot <oliver.sang@intel.com> wrote:
>
> kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on
> commit 8be7258aad44 ("mseal: add mseal syscall")

Ok, it's basically just the vma walk in can_modify_mm():

>      1.06            +0.1        1.18        perf-profile.self.cycles-pp.mas_next_slot
>      1.50            +0.5        1.97        perf-profile.self.cycles-pp.mas_find
>      0.00            +1.4        1.35        perf-profile.self.cycles-pp.can_modify_mm
>      3.13            +2.0        5.13        perf-profile.self.cycles-pp.mas_walk

and it looks like it comes from two different pathways. We have
__do_sys_mremap -> mremap_to -> do_munmap -> do_vmi_munmap ->
can_modify_mm for the destination mapping, but we also have
mremap_to() calling can_modify_mm() directly for the source mapping.

And then do_vmi_munmap() will do its *own* vma_find() after having
done arch_unmap().

And do_munmap() will obviously do its own vma lookup as part of
calling vma_to_resize().

So it looks like a large portion of this regression is because the
mseal addition just ends up walking the vma list way too much.

              Linus

^ permalink raw reply	[flat|nested] 29+ messages in thread
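
[ To make the duplicated walks concrete, here is a rough map of the
  mremap(MREMAP_FIXED) path as of this commit, reconstructed from the call
  chain named above (real function names, argument lists and everything else
  elided). Each line marked "tree walk" is a separate pass over the mm's
  maple tree:

	__do_sys_mremap()
	  -> mremap_to()
	       -> can_modify_mm(mm, addr, addr + old_len)   /* tree walk: source range */
	       -> do_munmap(mm, new_addr, new_len, ...)
	            -> do_vmi_munmap()
	                 -> can_modify_mm(mm, start, end)    /* tree walk: dest range   */
	                 -> arch_unmap(mm, start, end)
	                 -> vma_find(&vmi, end)              /* tree walk: dest range   */
	       -> vma_to_resize(addr, old_len, new_len, ...) /* its own source lookup   */
	       -> move_vma()
	            -> do_vmi_munmap()                       /* source range yet again  */

  None of these lookups share state, so the same ranges end up being walked
  several times per mremap() call, which matches the can_modify_mm call sites
  visible in the profile above. ]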
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
  2024-08-04 20:32 ` Linus Torvalds
@ 2024-08-05 13:33   ` Pedro Falcato
  2024-08-05 18:10     ` Jeff Xu
  1 sibling, 1 reply; 29+ messages in thread
From: Pedro Falcato @ 2024-08-05 13:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel,
      Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
      Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jeff Xu,
      Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
      Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
      Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
      linux-mm, ying.huang, feng.tang, fengwei.yin

On Sun, Aug 4, 2024 at 9:33 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, 4 Aug 2024 at 01:59, kernel test robot <oliver.sang@intel.com> wrote:
> >
> > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on
> > commit 8be7258aad44 ("mseal: add mseal syscall")
>
> Ok, it's basically just the vma walk in can_modify_mm():
>
> >      1.06            +0.1        1.18        perf-profile.self.cycles-pp.mas_next_slot
> >      1.50            +0.5        1.97        perf-profile.self.cycles-pp.mas_find
> >      0.00            +1.4        1.35        perf-profile.self.cycles-pp.can_modify_mm
> >      3.13            +2.0        5.13        perf-profile.self.cycles-pp.mas_walk
>
> and it looks like it comes from two different pathways. We have __do_sys_mremap ->
> mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the
> destination mapping, but we also have mremap_to() calling
> can_modify_mm() directly for the source mapping.
>
> And then do_vmi_munmap() will do its *own* vma_find() after having
> done arch_unmap().
>
> And do_munmap() will obviously do its own vma lookup as part of
> calling vma_to_resize().
>
> So it looks like a large portion of this regression is because the
> mseal addition just ends up walking the vma list way too much.

Can we roll back the upfront-check "funny business" and just call
can_modify_vma directly in the relevant places? I still don't believe in
the partial mprotect/munmap "security risks" that were stated in the
mseal thread (and these operations can already fail for many other
reasons than mseal) :)

I don't mind taking a look myself, just want to make sure I'm not
stepping on anyone's toes here.

-- 
Pedro

^ permalink raw reply	[flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
  2024-08-05 13:33   ` Pedro Falcato
@ 2024-08-05 18:10     ` Jeff Xu
  2024-08-05 18:55       ` Linus Torvalds
  0 siblings, 1 reply; 29+ messages in thread
From: Jeff Xu @ 2024-08-05 18:10 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Linus Torvalds, kernel test robot, Jeff Xu, oe-lkp, lkp,
      linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
      Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
      Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
      Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
      Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
      linux-mm, ying.huang, feng.tang, fengwei.yin

On Mon, Aug 5, 2024 at 6:33 AM Pedro Falcato <pedro.falcato@gmail.com> wrote:
>
> On Sun, Aug 4, 2024 at 9:33 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Sun, 4 Aug 2024 at 01:59, kernel test robot <oliver.sang@intel.com> wrote:
> > >
> > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on
> > > commit 8be7258aad44 ("mseal: add mseal syscall")
> >
> > Ok, it's basically just the vma walk in can_modify_mm():
> >
> > >      1.06            +0.1        1.18        perf-profile.self.cycles-pp.mas_next_slot
> > >      1.50            +0.5        1.97        perf-profile.self.cycles-pp.mas_find
> > >      0.00            +1.4        1.35        perf-profile.self.cycles-pp.can_modify_mm
> > >      3.13            +2.0        5.13        perf-profile.self.cycles-pp.mas_walk
> >
> > and it looks like it comes from two different pathways. We have __do_sys_mremap ->
> > mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the
> > destination mapping, but we also have mremap_to() calling
> > can_modify_mm() directly for the source mapping.
> >
> > And then do_vmi_munmap() will do its *own* vma_find() after having
> > done arch_unmap().
> >
> > And do_munmap() will obviously do its own vma lookup as part of
> > calling vma_to_resize().
> >
> > So it looks like a large portion of this regression is because the
> > mseal addition just ends up walking the vma list way too much.
>
> Can we roll back the upfront-check "funny business" and just call
> can_modify_vma directly in the relevant places? I still don't believe in
> the partial mprotect/munmap "security risks" that were stated in the
> mseal thread (and these operations can already fail for many other
> reasons than mseal) :)
>
An in-place check and the extra up-front loop, implemented properly, will
both prevent modification of sealed memory. However, the extra loop makes
it harder for an attacker to call munmap(0, <random large size>), because
if any VMA in the range is sealed, the whole operation becomes a no-op.

> I don't mind taking a look myself, just want to make sure I'm not
> stepping on anyone's toes here.
>
One thing you can't work around is that can_modify_mm must be called
prior to arch_unmap, which means an in-place check for munmap is not
possible. (There has been recent patching/refactoring by Liam R. Howlett
in this area, but I am not sure whether that restriction has been removed.)

> --
> Pedro

^ permalink raw reply	[flat|nested] 29+ messages in thread
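
[ For concreteness, the alternative Pedro is describing would fold the sealing
  check into the walk the munmap path already performs, instead of doing a
  dedicated pass first. A minimal sketch only, with a hypothetical helper
  (detach_one_vma() does not exist; it stands in for the existing
  split/detach logic in do_vmi_align_munmap()):

	/*
	 * In-place variant: no extra pass over the range, but a sealed VMA
	 * in the middle fails the call after earlier VMAs were already
	 * detached -- the partial-munmap semantics Jeff objects to above.
	 */
	static int munmap_inplace_check_sketch(struct mm_struct *mm,
					       unsigned long start,
					       unsigned long end)
	{
		struct vm_area_struct *vma;
		VMA_ITERATOR(vmi, mm, start);

		for_each_vma_range(vmi, vma, end) {
			if (vma->vm_flags & VM_SEALED)	/* i.e. !can_modify_vma() */
				return -EPERM;
			detach_one_vma(vma);		/* hypothetical helper */
		}
		return 0;
	}

  The merged code instead does the all-or-nothing check up front, at the cost
  of the extra maple-tree walk that shows up in the profile. ]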
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
  2024-08-05 18:10     ` Jeff Xu
@ 2024-08-05 18:55       ` Linus Torvalds
  2024-08-05 19:33         ` Linus Torvalds
  2024-08-05 19:37         ` Jeff Xu
  0 siblings, 2 replies; 29+ messages in thread
From: Linus Torvalds @ 2024-08-05 18:55 UTC (permalink / raw)
  To: Jeff Xu, Michael Ellerman, Nicholas Piggin, Christophe Leroy
  Cc: Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp,
      linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett,
      Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn,
      Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox,
      Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan,
      Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api,
      linux-mm, ying.huang, feng.tang, fengwei.yin

[-- Attachment #1: Type: text/plain, Size: 1685 bytes --]

On Mon, 5 Aug 2024 at 11:11, Jeff Xu <jeffxu@google.com> wrote:
>
> One thing you can't work around is that can_modify_mm must be called
> prior to arch_unmap, which means an in-place check for munmap is not
> possible.

Actually, we should move 'arch_unmap()'.

There is only one user of it, and it's pretty pointless.

(Ok, there are two users - x86 also has an 'arch_unmap()', but it's empty).

The reason I say that the current user of arch_unmap() is pointless is
because this is what the powerpc user does:

    static inline void arch_unmap(struct mm_struct *mm,
                                  unsigned long start, unsigned long end)
    {
            unsigned long vdso_base = (unsigned long)mm->context.vdso;

            if (start <= vdso_base && vdso_base < end)
                    mm->context.vdso = NULL;
    }

and that would make sense if we didn't have an actual 'vma' that
matched the vdso. But we do.

I think this code may predate the whole "create a vma for the vdso"
code. Or maybe it was just always confused.

Anyway, what the code *should* do is that we should just have a
->close() function for special mappings, and call that in
special_mapping_close().

This is an ENTIRELY UNTESTED patch that gets rid of this horrendous wart.

Michael / Nick / Christophe?

Note that I didn't even compile-test this on x86-64, much less on
powerpc. So please consider this a "maybe something like this" patch,
but that 'arch_unmap()' really is pretty nasty.

Oh, and there was a bug in the error path of the powerpc vdso setup
code anyway. The patch fixes that too, although considering the
entirely untested nature of it, the "fixes" is laughably optimistic.
Linus [-- Attachment #2: patch.diff --] [-- Type: text/x-patch, Size: 6309 bytes --] arch/powerpc/include/asm/mmu_context.h | 9 --------- arch/powerpc/kernel/vdso.c | 12 +++++++++++- arch/x86/include/asm/mmu_context.h | 5 ----- include/asm-generic/mm_hooks.h | 11 +++-------- include/linux/mm_types.h | 2 ++ mm/mmap.c | 15 ++++++--------- 6 files changed, 22 insertions(+), 32 deletions(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 37bffa0f7918..a334a1368848 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -260,15 +260,6 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, extern void arch_exit_mmap(struct mm_struct *mm); -static inline void arch_unmap(struct mm_struct *mm, - unsigned long start, unsigned long end) -{ - unsigned long vdso_base = (unsigned long)mm->context.vdso; - - if (start <= vdso_base && vdso_base < end) - mm->context.vdso = NULL; -} - #ifdef CONFIG_PPC_MEM_KEYS bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write, bool execute, bool foreign); diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 7a2ff9010f17..4de8af43f920 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -81,12 +81,20 @@ static int vdso64_mremap(const struct vm_special_mapping *sm, struct vm_area_str return vdso_mremap(sm, new_vma, &vdso64_end - &vdso64_start); } +static int vvar_close(const struct vm_special_mapping *sm, + struct vm_area_struct *vma) +{ + struct mm_struct *mm = vma->vm_mm; + mm->context.vdso = NULL; +} + static vm_fault_t vvar_fault(const struct vm_special_mapping *sm, struct vm_area_struct *vma, struct vm_fault *vmf); static struct vm_special_mapping vvar_spec __ro_after_init = { .name = "[vvar]", .fault = vvar_fault, + .close = vvar_close, }; static struct vm_special_mapping vdso32_spec __ro_after_init = { @@ -207,8 +215,10 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int vma = _install_special_mapping(mm, vdso_base, vvar_size, VM_READ | VM_MAYREAD | VM_IO | VM_DONTDUMP | VM_PFNMAP, &vvar_spec); - if (IS_ERR(vma)) + if (IS_ERR(vma)) { + mm->context.vdso = NULL; return PTR_ERR(vma); + } /* * our vma flags don't have VM_WRITE so by default, the process isn't diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 8dac45a2c7fc..80f2a3187aa6 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -232,11 +232,6 @@ static inline bool is_64bit_mm(struct mm_struct *mm) } #endif -static inline void arch_unmap(struct mm_struct *mm, unsigned long start, - unsigned long end) -{ -} - /* * We only want to enforce protection keys on the current process * because we effectively have no access to PKRU for other diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h index 4dbb177d1150..6eea3b3c1e65 100644 --- a/include/asm-generic/mm_hooks.h +++ b/include/asm-generic/mm_hooks.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* - * Define generic no-op hooks for arch_dup_mmap, arch_exit_mmap - * and arch_unmap to be included in asm-FOO/mmu_context.h for any - * arch FOO which doesn't need to hook these. + * Define generic no-op hooks for arch_dup_mmap and arch_exit_mmap + * to be included in asm-FOO/mmu_context.h for any arch FOO which + * doesn't need to hook these. 
*/ #ifndef _ASM_GENERIC_MM_HOOKS_H #define _ASM_GENERIC_MM_HOOKS_H @@ -17,11 +17,6 @@ static inline void arch_exit_mmap(struct mm_struct *mm) { } -static inline void arch_unmap(struct mm_struct *mm, - unsigned long start, unsigned long end) -{ -} - static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write, bool execute, bool foreign) { diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 485424979254..ef32d87a3adc 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -1313,6 +1313,8 @@ struct vm_special_mapping { int (*mremap)(const struct vm_special_mapping *sm, struct vm_area_struct *new_vma); + void (*close)(const struct vm_special_mapping *sm, + struct vm_area_struct *vma); }; enum tlb_flush_reason { diff --git a/mm/mmap.c b/mm/mmap.c index d0dfc85b209b..adaaf1ef197a 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2789,7 +2789,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma, * * This function takes a @mas that is either pointing to the previous VMA or set * to MA_START and sets it up to remove the mapping(s). The @len will be - * aligned and any arch_unmap work will be preformed. + * aligned. * * Return: 0 on success and drops the lock if so directed, error and leaves the * lock held otherwise. @@ -2809,16 +2809,12 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm, return -EINVAL; /* - * Check if memory is sealed before arch_unmap. - * Prevent unmapping a sealed VMA. + * Check if memory is sealed, prevent unmapping a sealed VMA. * can_modify_mm assumes we have acquired the lock on MM. */ if (unlikely(!can_modify_mm(mm, start, end))) return -EPERM; - /* arch_unmap() might do unmaps itself. */ - arch_unmap(mm, start, end); - /* Find the first overlapping VMA */ vma = vma_find(vmi, end); if (!vma) { @@ -3232,14 +3228,12 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma, struct mm_struct *mm = vma->vm_mm; /* - * Check if memory is sealed before arch_unmap. - * Prevent unmapping a sealed VMA. + * Check if memory is sealed, prevent unmapping a sealed VMA. * can_modify_mm assumes we have acquired the lock on MM. */ if (unlikely(!can_modify_mm(mm, start, end))) return -EPERM; - arch_unmap(mm, start, end); return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock); } @@ -3624,6 +3618,9 @@ static vm_fault_t special_mapping_fault(struct vm_fault *vmf); */ static void special_mapping_close(struct vm_area_struct *vma) { + const struct vm_special_mapping *sm = vma->vm_private_data; + if (sm->close) + sm->close(sm, vma); } static const char *special_mapping_name(struct vm_area_struct *vma) ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-05 18:55 ` Linus Torvalds @ 2024-08-05 19:33 ` Linus Torvalds 2024-08-06 2:14 ` Michael Ellerman 2024-08-06 6:04 ` Oliver Sang 2024-08-05 19:37 ` Jeff Xu 1 sibling, 2 replies; 29+ messages in thread From: Linus Torvalds @ 2024-08-05 19:33 UTC (permalink / raw) To: Jeff Xu, Michael Ellerman, Nicholas Piggin, Christophe Leroy Cc: Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin [-- Attachment #1: Type: text/plain, Size: 601 bytes --] On Mon, 5 Aug 2024 at 11:55, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > So please consider this a "maybe something like this" patch, but that > 'arch_unmap()' really is pretty nasty Actually, the whole powerpc vdso code confused me. It's not the vvar thing that wants this close thing, it's the other ones that have the remap thing. .. and there were two of those error cases that needed to reset the vdso pointer. That all shows just how carefully I was reading this code. New version - still untested, but now I've read through it one more time - attached. Linus [-- Attachment #2: patch.diff --] [-- Type: text/x-patch, Size: 6923 bytes --] arch/powerpc/include/asm/mmu_context.h | 9 --------- arch/powerpc/kernel/vdso.c | 17 +++++++++++++++-- arch/x86/include/asm/mmu_context.h | 5 ----- include/asm-generic/mm_hooks.h | 11 +++-------- include/linux/mm_types.h | 2 ++ mm/mmap.c | 15 ++++++--------- 6 files changed, 26 insertions(+), 33 deletions(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index 37bffa0f7918..a334a1368848 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -260,15 +260,6 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, extern void arch_exit_mmap(struct mm_struct *mm); -static inline void arch_unmap(struct mm_struct *mm, - unsigned long start, unsigned long end) -{ - unsigned long vdso_base = (unsigned long)mm->context.vdso; - - if (start <= vdso_base && vdso_base < end) - mm->context.vdso = NULL; -} - #ifdef CONFIG_PPC_MEM_KEYS bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write, bool execute, bool foreign); diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 7a2ff9010f17..6fa041a6690a 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -81,6 +81,13 @@ static int vdso64_mremap(const struct vm_special_mapping *sm, struct vm_area_str return vdso_mremap(sm, new_vma, &vdso64_end - &vdso64_start); } +static int vvar_close(const struct vm_special_mapping *sm, + struct vm_area_struct *vma) +{ + struct mm_struct *mm = vma->vm_mm; + mm->context.vdso = NULL; +} + static vm_fault_t vvar_fault(const struct vm_special_mapping *sm, struct vm_area_struct *vma, struct vm_fault *vmf); @@ -92,11 +99,13 @@ static struct vm_special_mapping vvar_spec __ro_after_init = { static struct vm_special_mapping vdso32_spec __ro_after_init = { .name = "[vdso]", .mremap = vdso32_mremap, + .close = vvar_close, }; static struct vm_special_mapping vdso64_spec __ro_after_init = { .name = "[vdso]", .mremap = vdso64_mremap, + .close = vvar_close, }; #ifdef 
CONFIG_TIME_NS @@ -207,8 +216,10 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int vma = _install_special_mapping(mm, vdso_base, vvar_size, VM_READ | VM_MAYREAD | VM_IO | VM_DONTDUMP | VM_PFNMAP, &vvar_spec); - if (IS_ERR(vma)) + if (IS_ERR(vma)) { + mm->context.vdso = NULL; return PTR_ERR(vma); + } /* * our vma flags don't have VM_WRITE so by default, the process isn't @@ -223,8 +234,10 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int vma = _install_special_mapping(mm, vdso_base + vvar_size, vdso_size, VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC, vdso_spec); - if (IS_ERR(vma)) + if (IS_ERR(vma)) { + mm->context.vdso = NULL; do_munmap(mm, vdso_base, vvar_size, NULL); + } return PTR_ERR_OR_ZERO(vma); } diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 8dac45a2c7fc..80f2a3187aa6 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -232,11 +232,6 @@ static inline bool is_64bit_mm(struct mm_struct *mm) } #endif -static inline void arch_unmap(struct mm_struct *mm, unsigned long start, - unsigned long end) -{ -} - /* * We only want to enforce protection keys on the current process * because we effectively have no access to PKRU for other diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h index 4dbb177d1150..6eea3b3c1e65 100644 --- a/include/asm-generic/mm_hooks.h +++ b/include/asm-generic/mm_hooks.h @@ -1,8 +1,8 @@ /* SPDX-License-Identifier: GPL-2.0 */ /* - * Define generic no-op hooks for arch_dup_mmap, arch_exit_mmap - * and arch_unmap to be included in asm-FOO/mmu_context.h for any - * arch FOO which doesn't need to hook these. + * Define generic no-op hooks for arch_dup_mmap and arch_exit_mmap + * to be included in asm-FOO/mmu_context.h for any arch FOO which + * doesn't need to hook these. */ #ifndef _ASM_GENERIC_MM_HOOKS_H #define _ASM_GENERIC_MM_HOOKS_H @@ -17,11 +17,6 @@ static inline void arch_exit_mmap(struct mm_struct *mm) { } -static inline void arch_unmap(struct mm_struct *mm, - unsigned long start, unsigned long end) -{ -} - static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write, bool execute, bool foreign) { diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 485424979254..ef32d87a3adc 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -1313,6 +1313,8 @@ struct vm_special_mapping { int (*mremap)(const struct vm_special_mapping *sm, struct vm_area_struct *new_vma); + void (*close)(const struct vm_special_mapping *sm, + struct vm_area_struct *vma); }; enum tlb_flush_reason { diff --git a/mm/mmap.c b/mm/mmap.c index d0dfc85b209b..adaaf1ef197a 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2789,7 +2789,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma, * * This function takes a @mas that is either pointing to the previous VMA or set * to MA_START and sets it up to remove the mapping(s). The @len will be - * aligned and any arch_unmap work will be preformed. + * aligned. * * Return: 0 on success and drops the lock if so directed, error and leaves the * lock held otherwise. @@ -2809,16 +2809,12 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm, return -EINVAL; /* - * Check if memory is sealed before arch_unmap. - * Prevent unmapping a sealed VMA. + * Check if memory is sealed, prevent unmapping a sealed VMA. * can_modify_mm assumes we have acquired the lock on MM. 
*/ if (unlikely(!can_modify_mm(mm, start, end))) return -EPERM; - /* arch_unmap() might do unmaps itself. */ - arch_unmap(mm, start, end); - /* Find the first overlapping VMA */ vma = vma_find(vmi, end); if (!vma) { @@ -3232,14 +3228,12 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma, struct mm_struct *mm = vma->vm_mm; /* - * Check if memory is sealed before arch_unmap. - * Prevent unmapping a sealed VMA. + * Check if memory is sealed, prevent unmapping a sealed VMA. * can_modify_mm assumes we have acquired the lock on MM. */ if (unlikely(!can_modify_mm(mm, start, end))) return -EPERM; - arch_unmap(mm, start, end); return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock); } @@ -3624,6 +3618,9 @@ static vm_fault_t special_mapping_fault(struct vm_fault *vmf); */ static void special_mapping_close(struct vm_area_struct *vma) { + const struct vm_special_mapping *sm = vma->vm_private_data; + if (sm->close) + sm->close(sm, vma); } static const char *special_mapping_name(struct vm_area_struct *vma) ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-05 19:33 ` Linus Torvalds @ 2024-08-06 2:14 ` Michael Ellerman 2024-08-06 2:17 ` Linus Torvalds 2024-08-06 6:04 ` Oliver Sang 1 sibling, 1 reply; 29+ messages in thread From: Michael Ellerman @ 2024-08-06 2:14 UTC (permalink / raw) To: Linus Torvalds, Jeff Xu, Nicholas Piggin, Christophe Leroy Cc: Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin Linus Torvalds <torvalds@linux-foundation.org> writes: > On Mon, 5 Aug 2024 at 11:55, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> So please consider this a "maybe something like this" patch, but that >> 'arch_unmap()' really is pretty nasty > > Actually, the whole powerpc vdso code confused me. It's not the vvar > thing that wants this close thing, it's the other ones that have the > remap thing. > > .. and there were two of those error cases that needed to reset the > vdso pointer. > > That all shows just how carefully I was reading this code. > > New version - still untested, but now I've read through it one more > time - attached. Needs a slight tweak to compile, vvar_close() needs to return void. And should probably be renamed vdso_close(). Diff below if anyone else wants to test it. I'm testing it now, but it should do what we need. cheers diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 6fa041a6690a..431b46976db8 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -81,8 +81,8 @@ static int vdso64_mremap(const struct vm_special_mapping *sm, struct vm_area_str return vdso_mremap(sm, new_vma, &vdso64_end - &vdso64_start); } -static int vvar_close(const struct vm_special_mapping *sm, - struct vm_area_struct *vma) +static void vdso_close(const struct vm_special_mapping *sm, + struct vm_area_struct *vma) { struct mm_struct *mm = vma->vm_mm; mm->context.vdso = NULL; @@ -99,13 +99,13 @@ static struct vm_special_mapping vvar_spec __ro_after_init = { static struct vm_special_mapping vdso32_spec __ro_after_init = { .name = "[vdso]", .mremap = vdso32_mremap, - .close = vvar_close, + .close = vdso_close, }; static struct vm_special_mapping vdso64_spec __ro_after_init = { .name = "[vdso]", .mremap = vdso64_mremap, - .close = vvar_close, + .close = vdso_close, }; #ifdef CONFIG_TIME_NS ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 2:14 ` Michael Ellerman @ 2024-08-06 2:17 ` Linus Torvalds 2024-08-06 12:03 ` Michael Ellerman 0 siblings, 1 reply; 29+ messages in thread From: Linus Torvalds @ 2024-08-06 2:17 UTC (permalink / raw) To: Michael Ellerman Cc: Jeff Xu, Nicholas Piggin, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Mon, 5 Aug 2024 at 19:14, Michael Ellerman <mpe@ellerman.id.au> wrote: > > Needs a slight tweak to compile, vvar_close() needs to return void. Ack, shows just how untested it was. > And should probably be renamed vdso_close(). .. and that was due to the initial confusion that I then fixed, but didn't fix the naming. So yes, those fixes look ObviouslyCorrect(tm). Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
  2024-08-06 2:17 ` Linus Torvalds
@ 2024-08-06 12:03 ` Michael Ellerman
  2024-08-06 14:43   ` Linus Torvalds
  0 siblings, 1 reply; 29+ messages in thread
From: Michael Ellerman @ 2024-08-06 12:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff Xu, Nicholas Piggin, Christophe Leroy, Pedro Falcato,
	kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel,
	Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen,
	Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet,
	Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum,
	Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany,
	Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang,
	feng.tang, fengwei.yin

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Mon, 5 Aug 2024 at 19:14, Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Needs a slight tweak to compile, vvar_close() needs to return void.
>
> Ack, shows just how untested it was.
>
>> And should probably be renamed vdso_close().
>
> .. and that was due to the initial confusion that I then fixed, but
> didn't fix the naming.

Ack.

> So yes, those fixes look ObviouslyCorrect(tm).

Needs another slight tweak to work correctly. Diff below.

With that our sigreturn_vdso selftest passes, and the CRIU vdso tests
pass also. So LGTM.

I'm not sure of the urgency on this, do you want to apply it directly?
If so feel free to add my tested-by/sob etc.

Or should I turn it into a series and post it?

cheers

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 431b46976db8..ed5ac4af4d83 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -85,6 +85,15 @@ static void vdso_close(const struct vm_special_mapping *sm,
 			  struct vm_area_struct *vma)
 {
 	struct mm_struct *mm = vma->vm_mm;
+
+	/*
+	 * close() is called for munmap() but also for mremap(). In the mremap()
+	 * case the vdso pointer has already been updated by the mremap() hook
+	 * above, so it must not be set to NULL here.
+	 */
+	if (vma->vm_start != (unsigned long)mm->context.vdso)
+		return;
+
 	mm->context.vdso = NULL;
 }

^ permalink raw reply related [flat|nested] 29+ messages in thread
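The early return reads more naturally with the mremap() ordering in mind. A simplified sketch of the sequence (call names abbreviated, not literal code):

	/*
	 * mremap() of the vdso, roughly:
	 *
	 *   move_vma()
	 *     copy_vma()                   -> new VMA at the new address
	 *     special mapping ->mremap()   -> vdso{32,64}_mremap() sets
	 *                                     mm->context.vdso to the new address
	 *     unmap of the old range       -> special_mapping_close()
	 *                                     -> vdso_close(sm, old_vma)
	 *
	 * By the time vdso_close() sees the old VMA, vm_start no longer matches
	 * mm->context.vdso, so the early return preserves the freshly updated
	 * pointer. A plain munmap() of the vdso still matches and clears it.
	 */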
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 12:03 ` Michael Ellerman @ 2024-08-06 14:43 ` Linus Torvalds 2024-08-07 12:26 ` Michael Ellerman 0 siblings, 1 reply; 29+ messages in thread From: Linus Torvalds @ 2024-08-06 14:43 UTC (permalink / raw) To: Michael Ellerman Cc: Jeff Xu, Nicholas Piggin, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Tue, 6 Aug 2024 at 05:03, Michael Ellerman <mpe@ellerman.id.au> wrote: > > Or should I turn it into a series and post it? I think post it as a single working patch rather than as a series that breaks things and then fixes it. And considering that you did all the testing and found the problems, just take ownership of it and make it a "Suggested-by: Linus" or something. That's what my original patch was anyway: "something like this". Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 14:43 ` Linus Torvalds @ 2024-08-07 12:26 ` Michael Ellerman 0 siblings, 0 replies; 29+ messages in thread From: Michael Ellerman @ 2024-08-07 12:26 UTC (permalink / raw) To: Linus Torvalds Cc: Jeff Xu, Nicholas Piggin, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin Linus Torvalds <torvalds@linux-foundation.org> writes: > On Tue, 6 Aug 2024 at 05:03, Michael Ellerman <mpe@ellerman.id.au> wrote: >> >> Or should I turn it into a series and post it? > > I think post it as a single working patch rather than as a series that > breaks things and then fixes it. It splits nicely with no breakage along the way. > And considering that you did all the testing and found the problems, > just take ownership of it and make it a "Suggested-by: Linus" or > something. Sure. cheers ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-05 19:33 ` Linus Torvalds 2024-08-06 2:14 ` Michael Ellerman @ 2024-08-06 6:04 ` Oliver Sang 2024-08-06 14:38 ` Linus Torvalds 2024-08-06 21:37 ` Pedro Falcato 1 sibling, 2 replies; 29+ messages in thread From: Oliver Sang @ 2024-08-06 6:04 UTC (permalink / raw) To: Linus Torvalds Cc: Jeff Xu, Michael Ellerman, Nicholas Piggin, Christophe Leroy, Pedro Falcato, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin, oliver.sang hi, Linus, On Mon, Aug 05, 2024 at 12:33:58PM -0700, Linus Torvalds wrote: > On Mon, 5 Aug 2024 at 11:55, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > So please consider this a "maybe something like this" patch, but that > > 'arch_unmap()' really is pretty nasty > > Actually, the whole powerpc vdso code confused me. It's not the vvar > thing that wants this close thing, it's the other ones that have the > remap thing. > > .. and there were two of those error cases that needed to reset the > vdso pointer. > > That all shows just how carefully I was reading this code. > > New version - still untested, but now I've read through it one more > time - attached. we tested this version by applying it directly upon 8be7258aad, but seems it have little impact to performance. still similar regression if comparing to ff388fe5c4. (the data for 8be7258aad and ff388fe5c4 are a little different with what we have in previous report, since we rerun tests by gcc-12 compiler. 
0day team change back to gcc-12 from gcc-13 recently due to some issues) ========================================================================================= compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s commit: ff388fe5c4 ("mseal: wire up mseal syscall") 8be7258aad ("mseal: add mseal syscall") 4605212a16 <--- your patch ff388fe5c481d39c 8be7258aad44b5e25977a98db13 4605212a162071afdd9c713e936 ---------------- --------------------------- --------------------------- %stddev %change %stddev %change %stddev \ | \ | \ 4958 +1.3% 5024 +1.2% 5020 time.percent_of_cpu_this_job_got 2916 +1.5% 2960 +1.4% 2957 time.system_time 65.85 -7.0% 61.27 -7.0% 61.23 time.user_time 41535129 -4.5% 39669773 -4.3% 39746835 proc-vmstat.numa_hit 41465484 -4.5% 39602956 -4.3% 39677556 proc-vmstat.numa_local 77303973 -4.6% 73780662 -4.4% 73912128 proc-vmstat.pgalloc_normal 77022096 -4.6% 73502058 -4.4% 73637326 proc-vmstat.pgfree 18381956 -4.9% 17473438 -5.0% 17457167 stress-ng.pagemove.ops 306349 -4.9% 291188 -5.0% 290931 stress-ng.pagemove.ops_per_sec 209930 -6.2% 196996 ± 2% -7.6% 193911 stress-ng.pagemove.page_remaps_per_sec 4958 +1.3% 5024 +1.2% 5020 stress-ng.time.percent_of_cpu_this_job_got 2916 +1.5% 2960 +1.4% 2957 stress-ng.time.system_time 1.13 -2.1% 1.10 -2.2% 1.10 perf-stat.i.cpi 0.89 +2.2% 0.91 +2.1% 0.91 perf-stat.i.ipc 1.04 -7.2% 0.97 -7.1% 0.97 perf-stat.overall.MPKI 1.13 -2.3% 1.10 -2.2% 1.10 perf-stat.overall.cpi 1082 +5.4% 1140 +5.3% 1139 perf-stat.overall.cycles-between-cache-misses 0.89 +2.3% 0.91 +2.3% 0.91 perf-stat.overall.ipc 192.79 -3.9% 185.32 ± 2% -2.4% 188.21 ± 3% perf-stat.ps.cpu-migrations 1.048e+13 +2.8% 1.078e+13 +2.6% 1.075e+13 perf-stat.total.instructions 74.97 -1.9 73.07 -2.1 72.88 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 36.79 -1.6 35.22 -1.6 35.17 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 24.98 -1.3 23.64 -1.4 23.57 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 19.91 -1.1 18.85 -1.1 18.83 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm 10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm 10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm 10.64 ± 3% -0.9 9.79 ± 3% -0.6 10.02 ± 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 10.63 ± 3% -0.9 9.78 ± 3% -0.6 10.01 ± 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork 10.63 ± 3% -0.9 9.78 ± 3% -0.6 10.01 ± 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 10.63 ± 3% -0.9 9.78 ± 3% -0.6 10.01 ± 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread 10.59 ± 3% -0.8 9.74 ± 3% -0.6 9.97 ± 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn 14.77 -0.8 14.00 -0.9 13.91 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 1.48 -0.5 0.99 -0.5 0.99 
perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.95 ± 3% -0.5 5.47 ± 3% -0.4 5.59 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 7.88 -0.4 7.48 -0.4 7.44 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 4.62 ± 3% -0.4 4.25 ± 3% -0.3 4.35 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs 6.72 -0.4 6.36 -0.3 6.39 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.15 -0.3 5.82 -0.4 5.80 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.11 -0.3 5.78 -0.3 5.82 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.78 -0.3 5.49 -0.3 5.46 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 5.54 -0.3 5.25 -0.3 5.22 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.56 -0.3 5.28 -0.3 5.24 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap 5.19 -0.3 4.92 -0.3 4.89 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap 5.20 -0.3 4.94 -0.3 4.91 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma 3.20 ± 4% -0.3 2.94 ± 3% -0.2 3.01 ± 2% perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 4.09 -0.2 3.85 -0.2 3.85 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 4.68 -0.2 4.45 -0.3 4.41 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma 2.63 ± 3% -0.2 2.42 ± 3% -0.2 2.48 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs 2.36 ± 2% -0.2 2.16 ± 4% -0.2 2.17 ± 2% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete 3.56 -0.2 3.36 -0.2 3.37 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap 4.00 -0.2 3.81 -0.2 3.78 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma 1.35 -0.2 1.16 -0.2 1.17 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 3.40 -0.2 3.22 -0.2 3.21 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap 2.22 -0.2 2.06 -0.2 2.05 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 0.96 -0.2 0.82 -0.1 0.82 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to 3.25 -0.1 3.10 -0.2 3.10 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 1.81 ± 4% -0.1 1.67 ± 3% -0.1 1.71 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core 1.97 ± 3% -0.1 1.83 ± 3% -0.2 1.81 ± 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 2.26 -0.1 2.12 -0.2 2.11 
perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 3.10 -0.1 2.96 -0.1 2.99 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 3.13 -0.1 2.99 -0.1 3.00 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 2.97 -0.1 2.85 -0.2 2.82 perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 2.05 -0.1 1.93 -0.1 1.92 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap 8.26 -0.1 8.14 -0.1 8.16 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 2.45 -0.1 2.34 -0.1 2.34 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma 2.43 -0.1 2.32 -0.1 2.32 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap 1.75 ± 2% -0.1 1.64 ± 3% -0.1 1.61 ± 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.54 -0.1 0.44 ± 37% -0.2 0.36 ± 63% perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 2.21 -0.1 2.11 -0.1 2.11 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables 1.27 ± 2% -0.1 1.16 ± 4% -0.1 1.18 ± 2% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 1.32 ± 3% -0.1 1.22 ± 3% -0.1 1.25 perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 1.85 -0.1 1.76 -0.1 1.76 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 2.14 ± 2% -0.1 2.05 ± 2% -0.1 2.00 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap 1.40 -0.1 1.31 -0.1 1.30 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap 1.77 ± 3% -0.1 1.68 ± 2% -0.1 1.64 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap 1.39 -0.1 1.30 -0.1 1.30 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma 1.24 -0.1 1.16 -0.1 1.16 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap 1.40 ± 3% -0.1 1.32 ± 4% -0.1 1.26 ± 5% perf-profile.calltrace.cycles-pp.down_write.vma_prepare.vma_merge.copy_vma.move_vma 0.94 -0.1 0.86 -0.1 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap 1.23 -0.1 1.15 -0.1 1.15 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma 1.54 -0.1 1.46 -0.1 1.46 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap 0.73 -0.1 0.67 -0.1 0.67 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 1.15 -0.1 1.09 -0.1 1.09 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.60 ± 2% -0.1 0.54 -0.1 0.54 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 1.27 -0.1 1.21 -0.1 1.21 
perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.72 -0.1 0.66 -0.1 0.65 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.70 ± 2% -0.1 0.64 ± 3% -0.1 0.64 ± 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma 0.79 -0.1 0.73 -0.1 0.73 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma 0.80 ± 2% -0.1 0.75 -0.1 0.74 ± 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 0.78 -0.1 0.72 -0.1 0.72 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge 1.02 -0.1 0.96 -0.1 0.96 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region 1.63 -0.1 1.58 -0.1 1.57 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.62 -0.0 0.58 -0.1 0.57 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma 0.60 ± 3% -0.0 0.56 ± 3% -0.0 0.57 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core 0.86 -0.0 0.81 -0.0 0.81 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 0.67 -0.0 0.62 -0.0 0.63 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 1.02 -0.0 0.97 -0.0 0.97 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.76 ± 2% -0.0 0.71 -0.0 0.72 ± 2% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 0.70 -0.0 0.66 -0.0 0.66 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.67 ± 2% -0.0 0.63 -0.0 0.63 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap 0.81 -0.0 0.77 -0.0 0.77 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.56 -0.0 0.51 -0.1 0.44 ± 40% perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma 0.98 -0.0 0.93 -0.0 0.93 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.69 -0.0 0.65 -0.0 0.65 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 0.78 -0.0 0.74 -0.0 0.74 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap 1.12 -0.0 1.08 -0.0 1.07 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap 0.68 -0.0 0.65 -0.0 0.65 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap 1.00 -0.0 0.97 -0.0 0.97 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.62 -0.0 0.59 -0.0 0.59 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.88 -0.0 0.85 -0.0 0.85 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 1.15 -0.0 1.12 -0.0 1.14 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 0.60 -0.0 0.57 ± 2% -0.0 0.57 
perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas 0.59 -0.0 0.56 -0.0 0.55 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap 0.62 ± 2% -0.0 0.59 ± 2% -0.0 0.58 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 0.65 -0.0 0.63 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.67 +0.1 0.74 +0.1 0.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise 0.76 +0.1 0.84 +0.1 0.83 perf-profile.calltrace.cycles-pp.__madvise 0.66 +0.1 0.74 +0.1 0.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.63 +0.1 0.71 +0.1 0.71 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.62 +0.1 0.70 +0.1 0.70 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 3.47 +0.1 3.55 +0.1 3.56 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 87.67 +0.8 88.47 +0.6 88.26 perf-profile.calltrace.cycles-pp.mremap 0.00 +0.9 0.86 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap 0.00 +0.9 0.88 +0.9 0.88 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap 0.00 +0.9 0.90 ± 2% +0.9 0.89 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma 84.82 +1.0 85.80 +0.8 85.60 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap 84.66 +1.0 85.65 +0.8 85.45 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 83.71 +1.0 84.73 +0.8 84.55 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.00 +1.1 1.10 +1.1 1.10 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.2 1.21 +1.2 1.22 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to 2.09 +1.5 3.60 +1.5 3.60 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.5 1.51 +1.5 1.49 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap 1.59 +1.5 3.12 +1.5 3.13 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.6 1.62 +1.6 1.62 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.7 1.72 +1.7 1.73 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 0.00 +2.0 2.01 +2.0 1.99 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.34 +3.0 8.38 +3.0 8.37 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 75.13 -1.9 73.22 -2.1 73.04 perf-profile.children.cycles-pp.move_vma 37.01 -1.6 35.43 -1.6 35.38 perf-profile.children.cycles-pp.do_vmi_align_munmap 25.06 -1.3 23.71 -1.4 23.65 perf-profile.children.cycles-pp.copy_vma 20.00 -1.1 18.94 -1.1 18.91 perf-profile.children.cycles-pp.__split_vma 19.86 -1.0 18.87 -1.0 18.89 perf-profile.children.cycles-pp.rcu_core 19.84 -1.0 18.85 -1.0 18.87 perf-profile.children.cycles-pp.rcu_do_batch 19.88 -1.0 18.89 -1.0 18.91 perf-profile.children.cycles-pp.handle_softirqs 10.70 
± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.children.cycles-pp.kthread 10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.children.cycles-pp.ret_from_fork 10.70 ± 3% -0.9 9.84 ± 3% -0.6 10.06 ± 2% perf-profile.children.cycles-pp.ret_from_fork_asm 10.64 ± 3% -0.9 9.79 ± 3% -0.6 10.02 ± 2% perf-profile.children.cycles-pp.smpboot_thread_fn 10.63 ± 3% -0.9 9.78 ± 3% -0.6 10.01 ± 2% perf-profile.children.cycles-pp.run_ksoftirqd 17.53 -0.8 16.70 -0.8 16.72 perf-profile.children.cycles-pp.kmem_cache_free 15.28 -0.8 14.47 -0.8 14.48 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof 15.16 -0.8 14.37 -0.9 14.29 perf-profile.children.cycles-pp.vma_merge 12.18 -0.6 11.54 -0.7 11.49 perf-profile.children.cycles-pp.mas_wr_store_entry 11.98 -0.6 11.36 -0.7 11.30 perf-profile.children.cycles-pp.mas_store_prealloc 12.11 -0.6 11.51 -0.6 11.51 perf-profile.children.cycles-pp.__slab_free 10.86 -0.6 10.26 -0.6 10.30 perf-profile.children.cycles-pp.vm_area_dup 9.89 -0.5 9.40 -0.6 9.33 perf-profile.children.cycles-pp.mas_wr_node_store 8.36 -0.4 7.92 -0.4 7.91 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook 7.98 -0.4 7.58 -0.4 7.55 perf-profile.children.cycles-pp.move_page_tables 6.69 -0.4 6.33 -0.4 6.32 perf-profile.children.cycles-pp.vma_complete 5.86 -0.3 5.56 -0.3 5.53 perf-profile.children.cycles-pp.move_ptes 5.11 -0.3 4.81 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate 6.05 -0.3 5.75 -0.3 5.76 perf-profile.children.cycles-pp.vm_area_free_rcu_cb 2.98 ± 2% -0.3 2.73 ± 4% -0.2 2.75 ± 2% perf-profile.children.cycles-pp.__memcpy 3.48 -0.2 3.26 -0.2 3.27 perf-profile.children.cycles-pp.___slab_alloc 3.46 ± 2% -0.2 3.26 -0.2 3.27 ± 2% perf-profile.children.cycles-pp.mod_objcg_state 2.91 -0.2 2.73 -0.2 2.73 perf-profile.children.cycles-pp.mas_alloc_nodes 2.43 -0.2 2.25 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev 3.47 -0.2 3.29 -0.2 3.23 ± 2% perf-profile.children.cycles-pp.down_write 3.46 -0.2 3.28 -0.2 3.27 perf-profile.children.cycles-pp.flush_tlb_mm_range 4.22 -0.2 4.06 -0.2 4.05 perf-profile.children.cycles-pp.anon_vma_clone 3.32 -0.2 3.17 -0.1 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook 3.35 -0.2 3.20 -0.2 3.20 perf-profile.children.cycles-pp.mas_store_gfp 2.22 -0.1 2.07 -0.1 2.07 perf-profile.children.cycles-pp.__cond_resched 3.18 -0.1 3.04 -0.1 3.05 perf-profile.children.cycles-pp.unmap_vmas 2.05 ± 2% -0.1 1.91 -0.1 1.93 ± 2% perf-profile.children.cycles-pp.allocate_slab 2.24 -0.1 2.11 ± 2% -0.2 2.08 ± 2% perf-profile.children.cycles-pp.vma_prepare 2.12 -0.1 2.00 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common 2.66 -0.1 2.53 -0.1 2.54 perf-profile.children.cycles-pp.mtree_load 2.46 -0.1 2.34 -0.1 2.34 perf-profile.children.cycles-pp.rcu_cblist_dequeue 2.49 -0.1 2.38 -0.1 2.38 perf-profile.children.cycles-pp.flush_tlb_func 8.32 -0.1 8.21 -0.1 8.22 perf-profile.children.cycles-pp.unmap_region 2.48 -0.1 2.37 -0.1 2.37 perf-profile.children.cycles-pp.unmap_page_range 2.23 -0.1 2.13 -0.1 2.13 perf-profile.children.cycles-pp.native_flush_tlb_one_user 1.77 -0.1 1.67 -0.1 1.67 perf-profile.children.cycles-pp.mas_wr_walk 1.88 -0.1 1.78 -0.1 1.78 perf-profile.children.cycles-pp.vma_link 1.40 -0.1 1.31 -0.1 1.31 perf-profile.children.cycles-pp.shuffle_freelist 1.84 -0.1 1.75 -0.1 1.76 ± 2% perf-profile.children.cycles-pp.up_write 0.97 ± 2% -0.1 0.88 -0.1 0.88 perf-profile.children.cycles-pp.rcu_all_qs 1.03 -0.1 0.95 -0.1 0.95 perf-profile.children.cycles-pp.mas_prev 0.92 -0.1 0.85 -0.1 0.84 perf-profile.children.cycles-pp.mas_prev_setup 1.58 
-0.1 1.50 -0.1 1.50 perf-profile.children.cycles-pp.zap_pmd_range 1.24 -0.1 1.17 -0.1 1.16 perf-profile.children.cycles-pp.mas_prev_slot 1.58 -0.1 1.51 -0.1 1.51 perf-profile.children.cycles-pp.mas_update_gap 0.62 -0.1 0.56 -0.1 0.56 perf-profile.children.cycles-pp.security_mmap_addr 0.49 ± 2% -0.1 0.43 -0.1 0.44 ± 2% perf-profile.children.cycles-pp.setup_object 0.90 -0.1 0.84 -0.1 0.85 perf-profile.children.cycles-pp.percpu_counter_add_batch 0.98 -0.1 0.92 -0.1 0.93 perf-profile.children.cycles-pp.mas_pop_node 0.85 -0.1 0.80 -0.1 0.79 perf-profile.children.cycles-pp._raw_spin_lock_irqsave 1.68 -0.1 1.62 -0.1 1.61 perf-profile.children.cycles-pp.__get_unmapped_area 1.23 -0.1 1.18 -0.1 1.17 perf-profile.children.cycles-pp.__pte_offset_map_lock 1.08 -0.1 1.03 -0.1 1.03 perf-profile.children.cycles-pp.zap_pte_range 0.69 ± 2% -0.0 0.64 -0.0 0.65 perf-profile.children.cycles-pp.syscall_return_via_sysret 1.04 -0.0 1.00 -0.1 0.99 perf-profile.children.cycles-pp.vma_to_resize 1.08 -0.0 1.04 -0.0 1.04 perf-profile.children.cycles-pp.mas_leaf_max_gap 0.51 ± 3% -0.0 0.47 -0.0 0.49 ± 4% perf-profile.children.cycles-pp.anon_vma_interval_tree_insert 1.18 -0.0 1.14 -0.1 1.13 perf-profile.children.cycles-pp.clear_bhb_loop 0.57 -0.0 0.53 -0.0 0.53 perf-profile.children.cycles-pp.mas_wr_end_piv 0.43 -0.0 0.40 -0.0 0.39 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 1.14 -0.0 1.10 -0.0 1.10 perf-profile.children.cycles-pp.mt_find 0.46 ± 7% -0.0 0.42 ± 2% -0.0 0.42 perf-profile.children.cycles-pp._raw_spin_lock 0.62 -0.0 0.58 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials 0.90 -0.0 0.87 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete 0.46 ± 3% -0.0 0.42 ± 3% -0.0 0.42 ± 3% perf-profile.children.cycles-pp.__alloc_pages_noprof 0.61 -0.0 0.58 -0.0 0.58 perf-profile.children.cycles-pp.entry_SYSCALL_64 0.48 -0.0 0.45 ± 2% -0.0 0.45 perf-profile.children.cycles-pp.mas_prev_range 0.64 -0.0 0.61 -0.0 0.60 perf-profile.children.cycles-pp.get_old_pud 0.31 ± 2% -0.0 0.28 ± 3% -0.0 0.28 ± 3% perf-profile.children.cycles-pp.security_vm_enough_memory_mm 0.33 ± 3% -0.0 0.30 ± 2% -0.0 0.30 ± 3% perf-profile.children.cycles-pp.mas_put_in_tree 0.32 ± 2% -0.0 0.29 ± 2% -0.0 0.30 ± 2% perf-profile.children.cycles-pp.tlb_finish_mmu 0.47 -0.0 0.44 ± 2% -0.0 0.44 perf-profile.children.cycles-pp.rcu_segcblist_enqueue 0.33 -0.0 0.31 -0.0 0.32 perf-profile.children.cycles-pp.mas_destroy 0.40 -0.0 0.39 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack 0.35 -0.0 0.34 -0.0 0.33 perf-profile.children.cycles-pp.__rb_insert_augmented 0.25 ± 4% -0.0 0.23 ± 3% -0.0 0.23 ± 3% perf-profile.children.cycles-pp.rmqueue 0.39 -0.0 0.37 -0.0 0.37 perf-profile.children.cycles-pp.down_write_killable 0.18 ± 3% -0.0 0.17 ± 5% -0.0 0.16 ± 5% perf-profile.children.cycles-pp.cap_vm_enough_memory 0.22 ± 4% -0.0 0.20 ± 3% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.__rmqueue_pcplist 0.21 ± 4% -0.0 0.19 ± 3% -0.0 0.19 ± 3% perf-profile.children.cycles-pp.rmqueue_bulk 0.52 -0.0 0.51 ± 2% -0.0 0.50 perf-profile.children.cycles-pp.__pte_offset_map 0.26 -0.0 0.24 ± 2% -0.0 0.24 ± 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode 0.30 ± 2% -0.0 0.28 ± 2% -0.0 0.28 ± 3% perf-profile.children.cycles-pp.__vm_enough_memory 0.29 -0.0 0.27 -0.0 0.27 ± 3% perf-profile.children.cycles-pp.tlb_gather_mmu 0.16 ± 2% -0.0 0.14 ± 3% -0.0 0.14 ± 4% perf-profile.children.cycles-pp.mas_wr_append 0.28 ± 2% -0.0 0.26 -0.0 0.26 perf-profile.children.cycles-pp.khugepaged_enter_vma 0.32 -0.0 0.30 -0.0 0.30 
perf-profile.children.cycles-pp.mas_wr_store_setup 0.20 ± 2% -0.0 0.18 ± 2% -0.0 0.18 ± 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders 0.32 -0.0 0.30 -0.0 0.30 ± 2% perf-profile.children.cycles-pp.pte_offset_map_nolock 0.09 ± 4% -0.0 0.08 ± 5% -0.0 0.07 ± 6% perf-profile.children.cycles-pp.vma_dup_policy 0.36 -0.0 0.35 -0.0 0.35 perf-profile.children.cycles-pp.madvise_vma_behavior 0.16 ± 3% -0.0 0.16 ± 2% -0.0 0.15 ± 3% perf-profile.children.cycles-pp._find_next_bit 0.14 ± 3% +0.0 0.15 ± 2% +0.0 0.15 perf-profile.children.cycles-pp.free_pgd_range 0.08 ± 4% +0.0 0.10 ± 4% +0.0 0.10 ± 4% perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags 0.78 +0.1 0.85 +0.1 0.85 perf-profile.children.cycles-pp.__madvise 0.63 +0.1 0.71 +0.1 0.71 perf-profile.children.cycles-pp.__x64_sys_madvise 0.63 +0.1 0.70 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise 3.52 +0.1 3.60 +0.1 3.61 perf-profile.children.cycles-pp.free_pgtables 0.00 +0.1 0.09 +0.1 0.09 ± 4% perf-profile.children.cycles-pp.can_modify_mm_madv 1.30 +0.2 1.46 +0.2 1.46 perf-profile.children.cycles-pp.mas_next_slot 88.06 +0.8 88.84 +0.6 88.64 perf-profile.children.cycles-pp.mremap 83.81 +1.0 84.84 +0.8 84.65 perf-profile.children.cycles-pp.__do_sys_mremap 85.98 +1.0 87.02 +0.8 86.82 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe 85.50 +1.1 86.56 +0.9 86.36 perf-profile.children.cycles-pp.do_syscall_64 2.12 +1.5 3.62 +1.5 3.63 perf-profile.children.cycles-pp.do_munmap 40.41 +1.5 41.93 +1.5 41.86 perf-profile.children.cycles-pp.do_vmi_munmap 3.62 +2.4 5.98 +2.3 5.95 perf-profile.children.cycles-pp.mas_walk 5.40 +3.0 8.44 +3.0 8.43 perf-profile.children.cycles-pp.mremap_to 5.26 +3.2 8.48 +3.2 8.47 perf-profile.children.cycles-pp.mas_find 0.00 +5.5 5.46 +5.4 5.45 perf-profile.children.cycles-pp.can_modify_mm 11.49 -0.6 10.92 -0.6 10.93 perf-profile.self.cycles-pp.__slab_free 4.32 -0.2 4.07 -0.3 3.98 ± 2% perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook 1.96 -0.2 1.80 ± 4% -0.1 1.83 ± 2% perf-profile.self.cycles-pp.__memcpy 2.36 ± 2% -0.1 2.24 ± 2% -0.2 2.19 ± 2% perf-profile.self.cycles-pp.down_write 2.42 -0.1 2.30 -0.1 2.32 perf-profile.self.cycles-pp.rcu_cblist_dequeue 2.33 -0.1 2.22 -0.1 2.23 perf-profile.self.cycles-pp.mtree_load 2.21 -0.1 2.10 -0.1 2.10 perf-profile.self.cycles-pp.native_flush_tlb_one_user 1.62 -0.1 1.54 -0.1 1.55 ± 2% perf-profile.self.cycles-pp.__memcg_slab_free_hook 1.52 -0.1 1.44 -0.1 1.44 perf-profile.self.cycles-pp.mas_wr_walk 1.15 ± 2% -0.1 1.07 -0.1 1.08 perf-profile.self.cycles-pp.shuffle_freelist 1.53 -0.1 1.45 -0.1 1.47 ± 2% perf-profile.self.cycles-pp.up_write 1.44 -0.1 1.36 -0.1 1.36 perf-profile.self.cycles-pp.__call_rcu_common 0.70 ± 2% -0.1 0.62 -0.1 0.63 perf-profile.self.cycles-pp.rcu_all_qs 1.72 -0.1 1.66 -0.1 1.66 perf-profile.self.cycles-pp.mod_objcg_state 3.77 -0.1 3.70 ± 4% -0.2 3.62 ± 2% perf-profile.self.cycles-pp.mas_wr_node_store 0.51 ± 3% -0.1 0.45 -0.1 0.45 perf-profile.self.cycles-pp.security_mmap_addr 0.94 ± 2% -0.1 0.88 ± 4% -0.1 0.88 ± 2% perf-profile.self.cycles-pp.vm_area_dup 1.18 -0.1 1.12 -0.1 1.12 perf-profile.self.cycles-pp.vma_merge 1.38 -0.1 1.33 -0.1 1.32 perf-profile.self.cycles-pp.do_vmi_align_munmap 0.89 -0.1 0.83 -0.1 0.83 perf-profile.self.cycles-pp.___slab_alloc 0.62 -0.1 0.56 ± 2% -0.1 0.56 ± 2% perf-profile.self.cycles-pp.mremap 1.00 -0.1 0.95 -0.1 0.95 perf-profile.self.cycles-pp.mas_preallocate 0.98 -0.1 0.93 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes 0.99 -0.1 0.94 -0.1 0.93 perf-profile.self.cycles-pp.mas_prev_slot 1.09 
-0.0 1.04 ± 2% -0.0 1.05 perf-profile.self.cycles-pp.__cond_resched 0.94 -0.0 0.90 -0.1 0.89 perf-profile.self.cycles-pp.vm_area_free_rcu_cb 0.85 -0.0 0.80 -0.0 0.81 perf-profile.self.cycles-pp.mas_pop_node 0.77 -0.0 0.72 -0.0 0.73 perf-profile.self.cycles-pp.percpu_counter_add_batch 0.68 -0.0 0.63 -0.0 0.64 perf-profile.self.cycles-pp.__split_vma 1.17 -0.0 1.13 -0.1 1.12 perf-profile.self.cycles-pp.clear_bhb_loop 0.95 -0.0 0.91 -0.0 0.91 perf-profile.self.cycles-pp.mas_leaf_max_gap 0.79 -0.0 0.75 -0.0 0.76 perf-profile.self.cycles-pp.mas_wr_store_entry 0.44 -0.0 0.40 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap 1.22 -0.0 1.18 -0.0 1.19 perf-profile.self.cycles-pp.move_vma 0.89 -0.0 0.86 -0.0 0.86 perf-profile.self.cycles-pp.mas_store_gfp 0.45 -0.0 0.42 -0.0 0.42 perf-profile.self.cycles-pp.mas_wr_end_piv 0.43 ± 2% -0.0 0.40 -0.0 0.39 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 0.78 -0.0 0.75 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete 0.66 -0.0 0.63 -0.0 0.63 perf-profile.self.cycles-pp.mas_store_prealloc 1.49 -0.0 1.46 -0.0 1.45 perf-profile.self.cycles-pp.kmem_cache_free 0.60 -0.0 0.58 -0.0 0.58 perf-profile.self.cycles-pp.unmap_region 0.86 -0.0 0.83 -0.0 0.83 perf-profile.self.cycles-pp.move_page_tables 0.43 ± 4% -0.0 0.40 -0.0 0.42 ± 4% perf-profile.self.cycles-pp.anon_vma_interval_tree_insert 0.99 -0.0 0.97 -0.0 0.97 perf-profile.self.cycles-pp.mt_find 0.36 ± 3% -0.0 0.33 ± 2% -0.0 0.33 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret 0.71 -0.0 0.68 -0.0 0.68 perf-profile.self.cycles-pp.unmap_page_range 0.55 -0.0 0.52 -0.0 0.51 perf-profile.self.cycles-pp.get_old_pud 0.49 -0.0 0.47 -0.0 0.47 perf-profile.self.cycles-pp.find_vma_prev 0.27 -0.0 0.25 -0.0 0.25 ± 2% perf-profile.self.cycles-pp.mas_prev_setup 0.41 -0.0 0.39 -0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock_irqsave 0.61 -0.0 0.58 -0.0 0.58 perf-profile.self.cycles-pp.copy_vma 0.47 -0.0 0.45 ± 2% -0.0 0.45 perf-profile.self.cycles-pp.flush_tlb_mm_range 0.37 ± 6% -0.0 0.35 ± 2% -0.0 0.35 perf-profile.self.cycles-pp._raw_spin_lock 0.42 ± 2% -0.0 0.40 ± 2% -0.0 0.40 perf-profile.self.cycles-pp.rcu_segcblist_enqueue 0.27 -0.0 0.25 ± 2% -0.0 0.25 ± 2% perf-profile.self.cycles-pp.mas_put_in_tree 0.39 -0.0 0.37 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack 0.44 -0.0 0.42 -0.0 0.42 perf-profile.self.cycles-pp.mas_update_gap 0.49 -0.0 0.47 -0.0 0.48 ± 2% perf-profile.self.cycles-pp.refill_obj_stock 0.27 ± 2% -0.0 0.25 ± 2% -0.0 0.25 perf-profile.self.cycles-pp.tlb_finish_mmu 0.34 -0.0 0.32 -0.0 0.32 ± 2% perf-profile.self.cycles-pp.zap_pmd_range 0.48 -0.0 0.46 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 0.28 -0.0 0.26 -0.0 0.26 perf-profile.self.cycles-pp.mas_alloc_nodes 0.24 ± 2% -0.0 0.22 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev 0.14 ± 3% -0.0 0.12 ± 2% -0.0 0.12 ± 2% perf-profile.self.cycles-pp.syscall_exit_to_user_mode 0.26 -0.0 0.24 -0.0 0.24 perf-profile.self.cycles-pp.__rb_insert_augmented 0.40 -0.0 0.39 -0.0 0.39 perf-profile.self.cycles-pp.__pte_offset_map_lock 0.28 -0.0 0.26 ± 3% -0.0 0.26 perf-profile.self.cycles-pp.mas_prev_range 0.33 ± 2% -0.0 0.32 -0.0 0.31 perf-profile.self.cycles-pp.zap_pte_range 0.28 -0.0 0.26 -0.0 0.26 perf-profile.self.cycles-pp.flush_tlb_func 0.44 -0.0 0.42 ± 2% -0.0 0.42 perf-profile.self.cycles-pp.__pte_offset_map 0.22 -0.0 0.21 ± 2% -0.0 0.21 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64 0.17 -0.0 0.16 -0.0 0.16 perf-profile.self.cycles-pp.__thp_vma_allowable_orders 0.10 -0.0 0.09 -0.0 
0.09 ± 3% perf-profile.self.cycles-pp.mod_node_page_state 0.06 -0.0 0.05 -0.0 0.05 perf-profile.self.cycles-pp.vma_dup_policy 0.06 ± 5% +0.0 0.07 +0.0 0.07 ± 4% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags 0.11 ± 4% +0.0 0.12 ± 4% +0.0 0.12 ± 2% perf-profile.self.cycles-pp.free_pgd_range 0.21 +0.0 0.22 ± 2% +0.0 0.22 perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags 0.45 +0.0 0.48 +0.0 0.48 perf-profile.self.cycles-pp.do_vmi_munmap 0.27 +0.0 0.32 +0.0 0.31 perf-profile.self.cycles-pp.free_pgtables 0.36 ± 2% +0.1 0.44 +0.1 0.45 perf-profile.self.cycles-pp.unlink_anon_vmas 1.06 +0.1 1.19 +0.1 1.19 perf-profile.self.cycles-pp.mas_next_slot 1.49 +0.5 2.01 +0.5 2.02 perf-profile.self.cycles-pp.mas_find 0.00 +1.4 1.38 +1.4 1.38 perf-profile.self.cycles-pp.can_modify_mm 3.15 +2.1 5.23 +2.1 5.22 perf-profile.self.cycles-pp.mas_walk > > Linus > arch/powerpc/include/asm/mmu_context.h | 9 --------- > arch/powerpc/kernel/vdso.c | 17 +++++++++++++++-- > arch/x86/include/asm/mmu_context.h | 5 ----- > include/asm-generic/mm_hooks.h | 11 +++-------- > include/linux/mm_types.h | 2 ++ > mm/mmap.c | 15 ++++++--------- > 6 files changed, 26 insertions(+), 33 deletions(-) > > diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h > index 37bffa0f7918..a334a1368848 100644 > --- a/arch/powerpc/include/asm/mmu_context.h > +++ b/arch/powerpc/include/asm/mmu_context.h > @@ -260,15 +260,6 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, > > extern void arch_exit_mmap(struct mm_struct *mm); > > -static inline void arch_unmap(struct mm_struct *mm, > - unsigned long start, unsigned long end) > -{ > - unsigned long vdso_base = (unsigned long)mm->context.vdso; > - > - if (start <= vdso_base && vdso_base < end) > - mm->context.vdso = NULL; > -} > - > #ifdef CONFIG_PPC_MEM_KEYS > bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write, > bool execute, bool foreign); > diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c > index 7a2ff9010f17..6fa041a6690a 100644 > --- a/arch/powerpc/kernel/vdso.c > +++ b/arch/powerpc/kernel/vdso.c > @@ -81,6 +81,13 @@ static int vdso64_mremap(const struct vm_special_mapping *sm, struct vm_area_str > return vdso_mremap(sm, new_vma, &vdso64_end - &vdso64_start); > } > > +static int vvar_close(const struct vm_special_mapping *sm, > + struct vm_area_struct *vma) > +{ > + struct mm_struct *mm = vma->vm_mm; > + mm->context.vdso = NULL; > +} > + > static vm_fault_t vvar_fault(const struct vm_special_mapping *sm, > struct vm_area_struct *vma, struct vm_fault *vmf); > > @@ -92,11 +99,13 @@ static struct vm_special_mapping vvar_spec __ro_after_init = { > static struct vm_special_mapping vdso32_spec __ro_after_init = { > .name = "[vdso]", > .mremap = vdso32_mremap, > + .close = vvar_close, > }; > > static struct vm_special_mapping vdso64_spec __ro_after_init = { > .name = "[vdso]", > .mremap = vdso64_mremap, > + .close = vvar_close, > }; > > #ifdef CONFIG_TIME_NS > @@ -207,8 +216,10 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int > vma = _install_special_mapping(mm, vdso_base, vvar_size, > VM_READ | VM_MAYREAD | VM_IO | > VM_DONTDUMP | VM_PFNMAP, &vvar_spec); > - if (IS_ERR(vma)) > + if (IS_ERR(vma)) { > + mm->context.vdso = NULL; > return PTR_ERR(vma); > + } > > /* > * our vma flags don't have VM_WRITE so by default, the process isn't > @@ -223,8 +234,10 @@ static int __arch_setup_additional_pages(struct linux_binprm *bprm, int uses_int > vma = 
_install_special_mapping(mm, vdso_base + vvar_size, vdso_size, > VM_READ | VM_EXEC | VM_MAYREAD | > VM_MAYWRITE | VM_MAYEXEC, vdso_spec); > - if (IS_ERR(vma)) > + if (IS_ERR(vma)) { > + mm->context.vdso = NULL; > do_munmap(mm, vdso_base, vvar_size, NULL); > + } > > return PTR_ERR_OR_ZERO(vma); > } > diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h > index 8dac45a2c7fc..80f2a3187aa6 100644 > --- a/arch/x86/include/asm/mmu_context.h > +++ b/arch/x86/include/asm/mmu_context.h > @@ -232,11 +232,6 @@ static inline bool is_64bit_mm(struct mm_struct *mm) > } > #endif > > -static inline void arch_unmap(struct mm_struct *mm, unsigned long start, > - unsigned long end) > -{ > -} > - > /* > * We only want to enforce protection keys on the current process > * because we effectively have no access to PKRU for other > diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h > index 4dbb177d1150..6eea3b3c1e65 100644 > --- a/include/asm-generic/mm_hooks.h > +++ b/include/asm-generic/mm_hooks.h > @@ -1,8 +1,8 @@ > /* SPDX-License-Identifier: GPL-2.0 */ > /* > - * Define generic no-op hooks for arch_dup_mmap, arch_exit_mmap > - * and arch_unmap to be included in asm-FOO/mmu_context.h for any > - * arch FOO which doesn't need to hook these. > + * Define generic no-op hooks for arch_dup_mmap and arch_exit_mmap > + * to be included in asm-FOO/mmu_context.h for any arch FOO which > + * doesn't need to hook these. > */ > #ifndef _ASM_GENERIC_MM_HOOKS_H > #define _ASM_GENERIC_MM_HOOKS_H > @@ -17,11 +17,6 @@ static inline void arch_exit_mmap(struct mm_struct *mm) > { > } > > -static inline void arch_unmap(struct mm_struct *mm, > - unsigned long start, unsigned long end) > -{ > -} > - > static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, > bool write, bool execute, bool foreign) > { > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > index 485424979254..ef32d87a3adc 100644 > --- a/include/linux/mm_types.h > +++ b/include/linux/mm_types.h > @@ -1313,6 +1313,8 @@ struct vm_special_mapping { > > int (*mremap)(const struct vm_special_mapping *sm, > struct vm_area_struct *new_vma); > + void (*close)(const struct vm_special_mapping *sm, > + struct vm_area_struct *vma); > }; > > enum tlb_flush_reason { > diff --git a/mm/mmap.c b/mm/mmap.c > index d0dfc85b209b..adaaf1ef197a 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -2789,7 +2789,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma, > * > * This function takes a @mas that is either pointing to the previous VMA or set > * to MA_START and sets it up to remove the mapping(s). The @len will be > - * aligned and any arch_unmap work will be preformed. > + * aligned. > * > * Return: 0 on success and drops the lock if so directed, error and leaves the > * lock held otherwise. > @@ -2809,16 +2809,12 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm, > return -EINVAL; > > /* > - * Check if memory is sealed before arch_unmap. > - * Prevent unmapping a sealed VMA. > + * Check if memory is sealed, prevent unmapping a sealed VMA. > * can_modify_mm assumes we have acquired the lock on MM. > */ > if (unlikely(!can_modify_mm(mm, start, end))) > return -EPERM; > > - /* arch_unmap() might do unmaps itself. 
*/ > - arch_unmap(mm, start, end); > - > /* Find the first overlapping VMA */ > vma = vma_find(vmi, end); > if (!vma) { > @@ -3232,14 +3228,12 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma, > struct mm_struct *mm = vma->vm_mm; > > /* > - * Check if memory is sealed before arch_unmap. > - * Prevent unmapping a sealed VMA. > + * Check if memory is sealed, prevent unmapping a sealed VMA. > * can_modify_mm assumes we have acquired the lock on MM. > */ > if (unlikely(!can_modify_mm(mm, start, end))) > return -EPERM; > > - arch_unmap(mm, start, end); > return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock); > } > > @@ -3624,6 +3618,9 @@ static vm_fault_t special_mapping_fault(struct vm_fault *vmf); > */ > static void special_mapping_close(struct vm_area_struct *vma) > { > + const struct vm_special_mapping *sm = vma->vm_private_data; > + if (sm->close) > + sm->close(sm, vma); > } > > static const char *special_mapping_name(struct vm_area_struct *vma) ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 6:04 ` Oliver Sang @ 2024-08-06 14:38 ` Linus Torvalds 2024-08-06 21:37 ` Pedro Falcato 1 sibling, 0 replies; 29+ messages in thread From: Linus Torvalds @ 2024-08-06 14:38 UTC (permalink / raw) To: Oliver Sang Cc: Jeff Xu, Michael Ellerman, Nicholas Piggin, Christophe Leroy, Pedro Falcato, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Mon, 5 Aug 2024 at 23:05, Oliver Sang <oliver.sang@intel.com> wrote: > > > New version - still untested, but now I've read through it one more > > time - attached. > > we tested this version by applying it directly upon 8be7258aad, but seems it > have little impact to performance. still similar regression if comparing to > ff388fe5c4. Note that that patch (and Michael's fixes for ppc on top) in itself doesn't fix any performance issue. But getting rid of arch_unmap() means that now the can_modify_mm() in do_vmi_munmap() is right above the "vma_find()" (and can in fact be moved below it and into do_vmi_align_munmap), and that means that at least the unmap paths don't need the vma lookup of can_modify_mm() at all, because they've done their own. IOW, the "arch_unmap()" removal was purely preparatory and did nothing on its own, it's only preparatory to get rid of some of the can_modify_mm() costs. The call to can_modify_mm() in mremap_to() is a bit harder to get rid of. Unless we just say "mremap will unmap the destination even if the mremap source is sealed". Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
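A rough illustration of the direction described above: test seal status on the VMAs the munmap path already walks instead of doing a separate can_modify_mm() range lookup up front. This is a sketch only; the can_modify_vma() name and VM_SEALED test are borrowed from the series Pedro posts below, not from this mail, and the real code folds the check into the existing split/detach loop with proper error unwinding.

	/* Sketch, not the patch as merged. */
	static inline bool can_modify_vma(struct vm_area_struct *vma)
	{
		return !(vma->vm_flags & VM_SEALED);	/* VM_SEALED is set by mseal() */
	}

	/* ... inside do_vmi_align_munmap(), which already iterates the range: */
	for_each_vma_range(*vmi, next, end) {
		if (unlikely(!can_modify_vma(next)))
			return -EPERM;	/* simplified; real code must undo partial work */
		/* existing __split_vma()/detach handling of 'next' continues here */
	}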
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 6:04 ` Oliver Sang 2024-08-06 14:38 ` Linus Torvalds @ 2024-08-06 21:37 ` Pedro Falcato 2024-08-07 5:54 ` Oliver Sang 1 sibling, 1 reply; 29+ messages in thread From: Pedro Falcato @ 2024-08-06 21:37 UTC (permalink / raw) To: Oliver Sang Cc: Linus Torvalds, Jeff Xu, Michael Ellerman, Nicholas Piggin, Christophe Leroy, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Tue, Aug 6, 2024 at 7:05 AM Oliver Sang <oliver.sang@intel.com> wrote: > > hi, Linus, > > On Mon, Aug 05, 2024 at 12:33:58PM -0700, Linus Torvalds wrote: > > On Mon, 5 Aug 2024 at 11:55, Linus Torvalds > > <torvalds@linux-foundation.org> wrote: > > > > > > So please consider this a "maybe something like this" patch, but that > > > 'arch_unmap()' really is pretty nasty > > > > Actually, the whole powerpc vdso code confused me. It's not the vvar > > thing that wants this close thing, it's the other ones that have the > > remap thing. > > > > .. and there were two of those error cases that needed to reset the > > vdso pointer. > > > > That all shows just how carefully I was reading this code. > > > > New version - still untested, but now I've read through it one more > > time - attached. > > we tested this version by applying it directly upon 8be7258aad, but seems it > have little impact to performance. still similar regression if comparing to > ff388fe5c4. Hi, I've just sent out a patch set[1] that should alleviate (or hopefully totally fix) these performance regressions. It'd be great if you could test it. For everyone: Apologies if you're in the CC list and I didn't CC you, but I tried to keep my patch set's CC list relatively short and clean (and I focused on the active participants). Everyone's comments are very welcome. [1]: https://lore.kernel.org/all/20240806212808.1885309-1-pedro.falcato@gmail.com/ -- Pedro ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 21:37 ` Pedro Falcato @ 2024-08-07 5:54 ` Oliver Sang 0 siblings, 0 replies; 29+ messages in thread From: Oliver Sang @ 2024-08-07 5:54 UTC (permalink / raw) To: Pedro Falcato Cc: Linus Torvalds, Jeff Xu, Michael Ellerman, Nicholas Piggin, Christophe Leroy, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin, oliver.sang hi, Pedro, On Tue, Aug 06, 2024 at 10:37:08PM +0100, Pedro Falcato wrote: > On Tue, Aug 6, 2024 at 7:05 AM Oliver Sang <oliver.sang@intel.com> wrote: > > > > hi, Linus, > > > > On Mon, Aug 05, 2024 at 12:33:58PM -0700, Linus Torvalds wrote: > > > On Mon, 5 Aug 2024 at 11:55, Linus Torvalds > > > <torvalds@linux-foundation.org> wrote: > > > > > > > > So please consider this a "maybe something like this" patch, but that > > > > 'arch_unmap()' really is pretty nasty > > > > > > Actually, the whole powerpc vdso code confused me. It's not the vvar > > > thing that wants this close thing, it's the other ones that have the > > > remap thing. > > > > > > .. and there were two of those error cases that needed to reset the > > > vdso pointer. > > > > > > That all shows just how carefully I was reading this code. > > > > > > New version - still untested, but now I've read through it one more > > > time - attached. > > > > we tested this version by applying it directly upon 8be7258aad, but seems it > > have little impact to performance. still similar regression if comparing to > > ff388fe5c4. > > Hi, > > I've just sent out a patch set[1] that should alleviate (or hopefully > totally fix) these performance regressions. It'd be great if you could > test it. yes, your patch set totally fixes the regression. our bot automatically fetch the patch set and apply it upon mainline d4560686726f7 as below. d58de4f958df2 (linux-review/Pedro-Falcato/mm-Move-can_modify_vma-to-mm-internal-h/20240807-054658) mm: Remove can_modify_mm() 32668c3efc23f mseal: Replace can_modify_mm_madv with a vma variant 5c3f48cf634c9 mseal: Fix is_madv_discard() 8cde2d71bd0f8 mm/mremap: Replace can_modify_mm with can_modify_vma cc3471461a854 mm/mprotect: Replace can_modify_mm with can_modify_vma abff8a9b6023e mm/munmap: Replace can_modify_mm with can_modify_vma c1bf07aa19804 mm: Move can_modify_vma to mm/internal.h d4560686726f7 (HEAD, linus/master) Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost I tested patch set tip d58de4f958df2 as well as d4560686726f7, below is the results combining with 8be7258aad and its parent. data from 8be7258aad and d4560686726f7 are close enough to within the noise. the patch set tip recover the performance to the level of ff388fe5c4. 
========================================================================================= compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s commit: ff388fe5c4 ("mseal: wire up mseal syscall") 8be7258aad ("mseal: add mseal syscall") d456068672 ("Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost") d58de4f958 ("mm: Remove can_modify_mm()") ff388fe5c481d39c 8be7258aad44b5e25977a98db13 d4560686726f7a357922f300fc8 d58de4f958df225c04fd490fe2d ---------------- --------------------------- --------------------------- --------------------------- %stddev %change %stddev %change %stddev %change %stddev \ | \ | \ | \ 44.92 -0.4% 44.76 -5.1% 42.62 -5.7% 42.37 boot-time.boot 33.12 -0.4% 33.00 -7.0% 30.81 -7.0% 30.81 boot-time.dhcp 2631 -0.4% 2620 -5.6% 2483 -6.2% 2468 boot-time.idle 4958 +1.3% 5024 +1.2% 5017 +0.0% 4960 time.percent_of_cpu_this_job_got 2916 +1.5% 2960 +1.4% 2956 +0.1% 2919 time.system_time 65.85 -7.0% 61.27 -6.8% 61.40 -3.4% 63.64 time.user_time 17869 ± 8% -5.6% 16869 ± 28% -24.5% 13488 ± 25% -3.5% 17240 ± 9% numa-vmstat.node0.nr_slab_reclaimable 5182 ± 29% +19.8% 6207 ± 75% +80.1% 9334 ± 36% +7.9% 5591 ± 28% numa-vmstat.node1.nr_slab_reclaimable 10153 ±170% +1041.4% 115893 ±214% +2787.4% 293183 ± 97% +371.7% 47894 ± 90% numa-vmstat.node1.nr_unevictable 10153 ±170% +1041.4% 115893 ±214% +2787.4% 293183 ± 97% +371.7% 47894 ± 90% numa-vmstat.node1.nr_zone_unevictable 71475 ± 8% -5.6% 67478 ± 28% -24.5% 53952 ± 25% -3.5% 68960 ± 9% numa-meminfo.node0.KReclaimable 71475 ± 8% -5.6% 67478 ± 28% -24.5% 53952 ± 25% -3.5% 68960 ± 9% numa-meminfo.node0.SReclaimable 20732 ± 29% +19.8% 24839 ± 75% +80.1% 37346 ± 36% +7.9% 22364 ± 28% numa-meminfo.node1.KReclaimable 20732 ± 29% +19.8% 24839 ± 75% +80.1% 37346 ± 36% +7.9% 22364 ± 28% numa-meminfo.node1.SReclaimable 40615 ±170% +1041.4% 463573 ±214% +2787.4% 1172733 ± 97% +371.7% 191576 ± 90% numa-meminfo.node1.Unevictable 23051 +0.1% 23079 -1.0% 22823 -1.0% 22831 proc-vmstat.nr_slab_reclaimable 41535129 -4.5% 39669773 -4.9% 39501465 -0.3% 41415171 proc-vmstat.numa_hit 41465484 -4.5% 39602956 -4.9% 39434855 -0.3% 41347677 proc-vmstat.numa_local 77303973 -4.6% 73780662 -5.0% 73449965 -0.3% 77049179 proc-vmstat.pgalloc_normal 77022096 -4.6% 73502058 -5.0% 73168463 -0.3% 76769054 proc-vmstat.pgfree 18381956 -4.9% 17473438 -5.1% 17450543 -0.4% 18316849 stress-ng.pagemove.ops 306349 -4.9% 291188 -5.1% 290820 -0.4% 305268 stress-ng.pagemove.ops_per_sec 209930 -6.2% 196996 ± 2% -5.4% 198614 -0.5% 208922 stress-ng.pagemove.page_remaps_per_sec 4958 +1.3% 5024 +1.2% 5017 +0.0% 4960 stress-ng.time.percent_of_cpu_this_job_got 2916 +1.5% 2960 +1.4% 2956 +0.1% 2919 stress-ng.time.system_time 3.337e+10 ± 4% +2.3% 3.414e+10 ± 3% +5.0% 3.503e+10 +1.2% 3.376e+10 perf-stat.i.branch-instructions 1.13 -2.1% 1.10 -2.3% 1.10 +0.1% 1.13 perf-stat.i.cpi 1.695e+11 ± 4% +1.1% 1.715e+11 ± 3% +3.8% 1.761e+11 +1.2% 1.715e+11 perf-stat.i.instructions 0.89 +2.2% 0.91 +2.1% 0.91 -0.4% 0.89 perf-stat.i.ipc 1.04 -7.2% 0.97 -7.2% 0.97 -0.2% 1.04 perf-stat.overall.MPKI 1.13 -2.3% 1.10 -2.1% 1.10 +0.3% 1.13 perf-stat.overall.cpi 1082 +5.4% 1140 +5.5% 1141 +0.5% 1087 perf-stat.overall.cycles-between-cache-misses 0.89 +2.3% 0.91 +2.1% 0.91 -0.3% 0.88 perf-stat.overall.ipc 3.284e+10 ± 4% +2.4% 3.362e+10 ± 2% +4.8% 3.443e+10 +1.1% 3.32e+10 perf-stat.ps.branch-instructions 192.79 -3.9% 185.32 ± 2% -1.7% 189.49 +0.2% 193.10 
perf-stat.ps.cpu-migrations 1.669e+11 ± 4% +1.2% 1.689e+11 ± 2% +3.7% 1.731e+11 +1.1% 1.687e+11 perf-stat.ps.instructions 1.048e+13 +2.8% 1.078e+13 +2.1% 1.07e+13 -0.6% 1.042e+13 perf-stat.total.instructions 74.97 -1.9 73.07 -1.7 73.32 +0.4 75.38 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 36.79 -1.6 35.22 -1.4 35.36 +0.3 37.08 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 24.98 -1.3 23.64 -1.3 23.73 +0.0 24.99 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 19.91 -1.1 18.85 -1.2 18.69 -0.2 19.72 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.33 ± 3% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm 10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.33 ± 3% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm 10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.33 ± 3% perf-profile.calltrace.cycles-pp.ret_from_fork_asm 10.64 ± 3% -0.9 9.79 ± 3% -0.9 9.73 ± 2% -0.4 10.29 ± 3% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 10.63 ± 3% -0.9 9.78 ± 3% -0.9 9.72 ± 2% -0.4 10.28 ± 3% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork 10.63 ± 3% -0.9 9.78 ± 3% -0.9 9.72 ± 2% -0.4 10.28 ± 3% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 10.63 ± 3% -0.9 9.78 ± 3% -0.9 9.72 ± 2% -0.4 10.28 ± 3% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread 10.59 ± 3% -0.8 9.74 ± 3% -0.9 9.68 ± 2% -0.4 10.24 ± 3% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn 14.77 -0.8 14.00 -0.7 14.11 +0.0 14.80 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 1.48 -0.5 0.99 -0.5 0.99 +0.0 1.52 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.95 ± 3% -0.5 5.47 ± 3% -0.5 5.44 ± 2% -0.2 5.73 ± 3% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 7.88 -0.4 7.48 -0.3 7.57 +0.1 7.97 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 4.62 ± 3% -0.4 4.25 ± 3% -0.4 4.20 ± 2% -0.2 4.42 ± 3% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs 6.72 -0.4 6.36 -0.4 6.33 -0.1 6.66 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.15 -0.3 5.82 -0.3 5.86 +0.0 6.16 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.11 -0.3 5.78 -0.3 5.77 -0.0 6.07 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.78 -0.3 5.49 -0.2 5.57 +0.1 5.85 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 5.54 -0.3 5.25 -0.3 5.28 +0.0 5.56 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.56 -0.3 5.28 -0.3 5.28 -0.0 5.54 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap 5.19 -0.3 4.92 -0.2 4.95 +0.0 5.21 
perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap 5.20 -0.3 4.94 -0.3 4.95 -0.0 5.18 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma 3.20 ± 4% -0.3 2.94 ± 3% -0.3 2.93 ± 2% -0.1 3.11 ± 3% perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 4.09 -0.2 3.85 -0.3 3.82 -0.1 4.03 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 4.68 -0.2 4.45 -0.2 4.46 -0.0 4.67 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma 2.63 ± 3% -0.2 2.42 ± 3% -0.2 2.43 ± 2% -0.1 2.57 ± 3% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs 2.36 ± 2% -0.2 2.16 ± 4% -0.3 2.04 ± 14% -0.1 2.28 ± 3% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete 3.56 -0.2 3.36 -0.2 3.34 -0.0 3.52 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap 4.00 -0.2 3.81 -0.1 3.87 ± 2% +0.1 4.06 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma 1.35 -0.2 1.16 -0.2 1.16 +0.0 1.36 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 3.40 -0.2 3.22 -0.2 3.24 +0.0 3.41 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap 2.22 -0.2 2.06 -0.2 2.07 +0.0 2.24 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 0.96 -0.2 0.82 -0.2 0.81 +0.0 0.97 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to 3.25 -0.1 3.10 -0.1 3.14 +0.0 3.30 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 1.81 ± 4% -0.1 1.67 ± 3% -0.2 1.64 ± 2% -0.1 1.74 ± 3% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core 1.97 ± 3% -0.1 1.83 ± 3% -0.6 1.41 ± 3% -0.5 1.50 ± 2% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 2.26 -0.1 2.12 -0.2 2.05 -0.1 2.16 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 3.10 -0.1 2.96 +0.3 3.38 +0.5 3.60 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 3.13 -0.1 2.99 -0.1 3.06 +0.1 3.23 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 2.97 -0.1 2.85 -0.2 2.75 ± 2% -0.0 2.94 ± 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 2.05 -0.1 1.93 -0.1 1.98 -0.1 1.99 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap 8.26 -0.1 8.14 +0.2 8.45 +0.5 8.78 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 2.45 -0.1 2.34 -0.1 2.34 +0.0 2.46 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma 2.43 -0.1 2.32 -0.0 2.39 +0.1 2.55 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap 1.75 ± 2% -0.1 1.64 ± 3% -0.1 1.64 ± 4% +0.0 1.77 ± 4% 
perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.54 -0.1 0.44 ± 37% -0.0 0.51 +0.0 0.55 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 1.27 ± 2% -0.1 1.16 ± 4% -0.1 1.14 ± 6% -0.0 1.23 ± 4% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 1.32 ± 3% -0.1 1.22 ± 3% -0.1 1.20 ± 2% -0.0 1.28 ± 3% perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 2.21 -0.1 2.11 -0.1 2.11 +0.0 2.23 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables 1.85 -0.1 1.76 -0.1 1.78 +0.0 1.87 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 2.14 ± 2% -0.1 2.05 ± 2% -0.1 2.00 ± 2% +0.0 2.14 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap 1.79 ± 2% -0.1 1.70 +0.1 1.93 +0.3 2.06 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 1.40 -0.1 1.31 -0.1 1.27 -0.1 1.34 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap 1.39 -0.1 1.30 -0.1 1.34 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma 1.24 -0.1 1.16 -0.1 1.13 -0.1 1.19 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap 0.94 -0.1 0.86 -0.1 0.86 +0.0 0.96 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap 1.23 -0.1 1.15 -0.0 1.18 -0.1 1.18 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma 1.54 -0.1 1.46 -0.0 1.50 +0.1 1.60 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap 0.73 -0.1 0.67 -0.1 0.67 +0.0 0.74 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 1.15 -0.1 1.09 -0.1 1.08 -0.0 1.13 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.60 ± 2% -0.1 0.54 -0.0 0.56 -0.0 0.59 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 1.27 -0.1 1.21 -0.0 1.22 +0.0 1.30 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 38.74 -0.1 38.68 +0.1 38.80 +0.3 39.06 perf-profile.calltrace.cycles-pp.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.38 ± 4% -0.1 1.32 ± 2% -0.2 1.20 ± 3% -0.1 1.27 ± 2% perf-profile.calltrace.cycles-pp.obj_cgroup_charge.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 0.72 -0.1 0.66 -0.1 0.66 +0.0 0.72 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.70 ± 2% -0.1 0.64 ± 3% +0.1 0.80 ± 3% +0.2 0.85 ± 3% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma 0.79 -0.1 0.73 -0.1 0.73 +0.0 0.79 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma 0.80 ± 2% -0.1 0.75 -0.1 0.72 ± 3% -0.0 0.77 ± 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 0.78 -0.1 0.72 -0.0 0.73 +0.0 0.78 
perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge 1.02 -0.1 0.96 +0.0 1.02 +0.1 1.09 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region 1.63 -0.1 1.58 -0.1 1.58 +0.0 1.64 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.62 -0.0 0.58 -0.1 0.57 +0.0 0.63 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma 0.60 ± 3% -0.0 0.56 ± 3% -0.0 0.59 ± 3% +0.0 0.63 ± 3% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core 0.67 -0.0 0.62 -0.1 0.59 -0.1 0.61 ± 2% perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.86 -0.0 0.81 -0.0 0.82 +0.0 0.87 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 1.02 -0.0 0.97 -0.0 0.98 +0.0 1.04 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.76 ± 2% -0.0 0.71 -0.1 0.71 ± 2% -0.0 0.74 ± 2% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 0.81 -0.0 0.77 -0.1 0.76 -0.0 0.81 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.70 -0.0 0.66 -0.0 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.67 ± 2% -0.0 0.63 -0.0 0.65 ± 2% +0.0 0.68 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap 0.56 -0.0 0.51 -0.2 0.38 ± 57% +0.0 0.56 perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma 0.69 -0.0 0.65 -0.0 0.64 ± 2% -0.0 0.68 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 0.98 -0.0 0.93 -0.0 0.94 +0.0 0.98 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.77 ± 5% -0.0 0.73 ± 2% -0.1 0.66 ± 4% -0.1 0.70 ± 4% perf-profile.calltrace.cycles-pp.obj_cgroup_charge.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma 0.78 -0.0 0.74 -0.0 0.75 +0.0 0.79 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap 1.12 -0.0 1.08 -0.1 1.06 +0.0 1.12 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap 0.68 -0.0 0.65 -0.0 0.66 +0.0 0.68 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap 1.00 -0.0 0.97 -0.0 0.96 +0.0 1.02 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.62 -0.0 0.59 -0.0 0.59 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.88 -0.0 0.85 -0.0 0.85 +0.0 0.88 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 1.15 -0.0 1.12 -0.1 1.08 -0.0 1.13 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 0.60 -0.0 0.57 ± 2% +0.0 0.62 +0.1 0.66 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas 0.59 -0.0 0.56 -0.0 0.56 -0.0 0.57 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap 0.62 ± 2% -0.0 0.59 ± 2% -0.0 0.59 +0.0 0.63 
perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 0.65 -0.0 0.63 -0.0 0.63 +0.0 0.66 perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.55 -0.0 0.53 +0.0 0.58 +0.1 0.61 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.move_ptes.move_page_tables.move_vma.__do_sys_mremap 0.74 -0.0 0.72 -0.1 0.68 ± 2% -0.0 0.71 ± 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap 0.67 +0.1 0.74 +0.1 0.73 +0.0 0.68 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise 0.76 +0.1 0.84 +0.1 0.82 +0.0 0.78 perf-profile.calltrace.cycles-pp.__madvise 0.66 +0.1 0.74 +0.1 0.73 +0.0 0.67 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.63 +0.1 0.71 +0.1 0.70 +0.0 0.64 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.62 +0.1 0.70 +0.1 0.69 +0.0 0.64 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 3.47 +0.1 3.55 +0.4 3.89 +0.5 3.95 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 87.67 +0.8 88.47 +0.9 88.53 +0.3 88.01 perf-profile.calltrace.cycles-pp.mremap 0.00 +0.9 0.86 +0.8 0.84 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap 0.00 +0.9 0.88 +0.9 0.86 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap 0.00 +0.9 0.90 ± 2% +0.9 0.90 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma 84.82 +1.0 85.80 +1.0 85.84 +0.4 85.19 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap 84.66 +1.0 85.65 +1.0 85.69 +0.4 85.04 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 83.71 +1.0 84.73 +1.2 84.89 +0.5 84.18 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.00 +1.1 1.10 +1.1 1.08 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.2 1.21 +1.2 1.20 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to 2.09 +1.5 3.60 +1.5 3.59 +0.0 2.11 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.5 1.51 +1.5 1.50 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap 1.59 +1.5 3.12 +1.5 3.11 +0.0 1.60 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.6 1.62 +1.6 1.59 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.7 1.72 +1.7 1.72 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 0.00 +2.0 2.01 +2.0 1.99 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.34 +3.0 8.38 +3.0 8.34 +0.1 5.41 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 75.13 -1.9 73.22 -1.7 73.47 +0.4 75.55 perf-profile.children.cycles-pp.move_vma 37.01 -1.6 35.43 -1.4 35.56 +0.3 37.30 perf-profile.children.cycles-pp.do_vmi_align_munmap 25.06 -1.3 23.71 -1.3 23.80 +0.0 25.06 
perf-profile.children.cycles-pp.copy_vma 20.00 -1.1 18.94 -1.2 18.77 -0.2 19.81 perf-profile.children.cycles-pp.__split_vma 19.86 -1.0 18.87 -0.9 18.92 -0.0 19.84 perf-profile.children.cycles-pp.rcu_core 19.84 -1.0 18.85 -0.9 18.90 -0.0 19.82 perf-profile.children.cycles-pp.rcu_do_batch 19.88 -1.0 18.89 -0.9 18.94 -0.0 19.86 perf-profile.children.cycles-pp.handle_softirqs 10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.33 ± 3% perf-profile.children.cycles-pp.kthread 10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.34 ± 3% perf-profile.children.cycles-pp.ret_from_fork 10.70 ± 3% -0.9 9.84 ± 3% -0.9 9.78 ± 2% -0.4 10.34 ± 3% perf-profile.children.cycles-pp.ret_from_fork_asm 10.64 ± 3% -0.9 9.79 ± 3% -0.9 9.73 ± 2% -0.4 10.29 ± 3% perf-profile.children.cycles-pp.smpboot_thread_fn 10.63 ± 3% -0.9 9.78 ± 3% -0.9 9.72 ± 2% -0.4 10.28 ± 3% perf-profile.children.cycles-pp.run_ksoftirqd 17.53 -0.8 16.70 -0.8 16.76 +0.0 17.54 perf-profile.children.cycles-pp.kmem_cache_free 15.28 -0.8 14.47 -1.0 14.33 -0.2 15.04 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof 15.16 -0.8 14.37 -0.7 14.48 +0.0 15.20 perf-profile.children.cycles-pp.vma_merge 12.18 -0.6 11.54 -0.6 11.60 +0.0 12.20 perf-profile.children.cycles-pp.mas_wr_store_entry 11.98 -0.6 11.36 -0.6 11.41 +0.0 11.98 perf-profile.children.cycles-pp.mas_store_prealloc 12.11 -0.6 11.51 -0.6 11.50 -0.1 12.02 perf-profile.children.cycles-pp.__slab_free 10.86 -0.6 10.26 -0.7 10.21 -0.1 10.75 perf-profile.children.cycles-pp.vm_area_dup 9.89 -0.5 9.40 -0.5 9.44 +0.0 9.93 perf-profile.children.cycles-pp.mas_wr_node_store 8.36 -0.4 7.92 -0.4 7.97 +0.1 8.49 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook 7.98 -0.4 7.58 -0.3 7.68 +0.1 8.08 perf-profile.children.cycles-pp.move_page_tables 6.69 -0.4 6.33 -0.3 6.39 +0.0 6.72 perf-profile.children.cycles-pp.vma_complete 5.86 -0.3 5.56 -0.2 5.64 +0.1 5.93 perf-profile.children.cycles-pp.move_ptes 5.11 -0.3 4.81 -0.3 4.80 -0.2 4.95 perf-profile.children.cycles-pp.mas_preallocate 6.05 -0.3 5.75 -0.3 5.77 +0.0 6.07 perf-profile.children.cycles-pp.vm_area_free_rcu_cb 2.98 ± 2% -0.3 2.73 ± 4% -0.3 2.66 ± 6% -0.1 2.88 ± 3% perf-profile.children.cycles-pp.__memcpy 3.48 -0.2 3.26 -0.2 3.25 -0.0 3.45 perf-profile.children.cycles-pp.___slab_alloc 3.46 ± 2% -0.2 3.26 +0.3 3.71 ± 2% +0.5 3.92 ± 2% perf-profile.children.cycles-pp.mod_objcg_state 2.91 -0.2 2.73 -0.2 2.73 -0.1 2.79 perf-profile.children.cycles-pp.mas_alloc_nodes 2.43 -0.2 2.25 -0.2 2.27 +0.0 2.45 perf-profile.children.cycles-pp.find_vma_prev 3.47 -0.2 3.29 -0.2 3.27 ± 2% +0.0 3.50 ± 2% perf-profile.children.cycles-pp.down_write 3.46 -0.2 3.28 -0.2 3.30 +0.0 3.46 perf-profile.children.cycles-pp.flush_tlb_mm_range 4.22 -0.2 4.06 -0.3 3.91 -0.1 4.16 perf-profile.children.cycles-pp.anon_vma_clone 3.32 -0.2 3.17 -0.1 3.25 +0.1 3.42 perf-profile.children.cycles-pp.__memcg_slab_free_hook 3.35 -0.2 3.20 -0.1 3.24 +0.0 3.40 perf-profile.children.cycles-pp.mas_store_gfp 2.22 -0.1 2.07 -0.1 2.12 +0.0 2.24 perf-profile.children.cycles-pp.__cond_resched 2.05 ± 2% -0.1 1.91 -0.1 1.92 -0.0 2.04 perf-profile.children.cycles-pp.allocate_slab 3.18 -0.1 3.04 -0.1 3.11 +0.1 3.28 perf-profile.children.cycles-pp.unmap_vmas 2.24 -0.1 2.11 ± 2% -0.1 2.10 ± 3% +0.0 2.25 ± 3% perf-profile.children.cycles-pp.vma_prepare 2.12 -0.1 2.00 -0.2 1.95 -0.0 2.08 perf-profile.children.cycles-pp.__call_rcu_common 2.66 -0.1 2.53 -0.1 2.53 +0.0 2.68 perf-profile.children.cycles-pp.mtree_load 2.46 -0.1 2.34 -0.1 2.34 +0.0 2.47 perf-profile.children.cycles-pp.rcu_cblist_dequeue 2.45 ± 
4% -0.1 2.33 ± 2% -0.3 2.15 ± 3% -0.2 2.28 ± 2% perf-profile.children.cycles-pp.obj_cgroup_charge 2.49 -0.1 2.38 -0.1 2.39 +0.0 2.51 perf-profile.children.cycles-pp.flush_tlb_func 8.32 -0.1 8.21 +0.2 8.52 +0.5 8.85 perf-profile.children.cycles-pp.unmap_region 2.48 -0.1 2.37 -0.0 2.44 +0.1 2.59 perf-profile.children.cycles-pp.unmap_page_range 2.23 -0.1 2.13 -0.1 2.12 +0.0 2.24 perf-profile.children.cycles-pp.native_flush_tlb_one_user 1.77 -0.1 1.67 -0.1 1.68 -0.0 1.76 perf-profile.children.cycles-pp.mas_wr_walk 1.88 -0.1 1.78 -0.1 1.80 +0.0 1.89 perf-profile.children.cycles-pp.vma_link 1.40 -0.1 1.31 -0.1 1.32 -0.0 1.40 ± 2% perf-profile.children.cycles-pp.shuffle_freelist 1.84 -0.1 1.75 -0.1 1.75 +0.0 1.85 perf-profile.children.cycles-pp.up_write 0.97 ± 2% -0.1 0.88 -0.1 0.90 ± 2% -0.0 0.94 ± 2% perf-profile.children.cycles-pp.rcu_all_qs 1.03 -0.1 0.95 -0.1 0.94 +0.0 1.04 perf-profile.children.cycles-pp.mas_prev 0.92 -0.1 0.85 -0.1 0.84 -0.0 0.92 perf-profile.children.cycles-pp.mas_prev_setup 1.58 -0.1 1.50 -0.0 1.54 +0.1 1.64 perf-profile.children.cycles-pp.zap_pmd_range 1.24 -0.1 1.17 -0.1 1.18 -0.0 1.24 perf-profile.children.cycles-pp.mas_prev_slot 1.58 -0.1 1.51 -0.1 1.52 +0.0 1.59 perf-profile.children.cycles-pp.mas_update_gap 0.62 -0.1 0.56 -0.0 0.58 -0.0 0.62 perf-profile.children.cycles-pp.security_mmap_addr 0.49 ± 2% -0.1 0.43 -0.0 0.44 ± 2% -0.0 0.46 ± 3% perf-profile.children.cycles-pp.setup_object 0.90 -0.1 0.84 -0.1 0.75 -0.1 0.78 perf-profile.children.cycles-pp.percpu_counter_add_batch 0.98 -0.1 0.92 -0.0 0.97 +0.0 1.02 perf-profile.children.cycles-pp.mas_pop_node 0.85 -0.1 0.80 -0.1 0.78 -0.0 0.84 perf-profile.children.cycles-pp._raw_spin_lock_irqsave 1.68 -0.1 1.62 -0.1 1.62 +0.0 1.68 perf-profile.children.cycles-pp.__get_unmapped_area 1.23 -0.1 1.18 +0.0 1.27 +0.1 1.34 perf-profile.children.cycles-pp.__pte_offset_map_lock 1.08 -0.1 1.03 -0.0 1.08 +0.1 1.14 perf-profile.children.cycles-pp.zap_pte_range 0.69 ± 2% -0.0 0.64 -0.0 0.67 ± 2% +0.0 0.70 perf-profile.children.cycles-pp.syscall_return_via_sysret 1.04 -0.0 1.00 -0.0 1.00 +0.0 1.08 perf-profile.children.cycles-pp.vma_to_resize 1.08 -0.0 1.04 -0.0 1.04 +0.0 1.10 perf-profile.children.cycles-pp.mas_leaf_max_gap 0.51 ± 3% -0.0 0.47 -0.0 0.47 -0.0 0.51 perf-profile.children.cycles-pp.anon_vma_interval_tree_insert 1.18 -0.0 1.14 -0.1 1.12 +0.0 1.18 perf-profile.children.cycles-pp.clear_bhb_loop 0.57 -0.0 0.53 -0.0 0.52 ± 2% -0.0 0.54 perf-profile.children.cycles-pp.mas_wr_end_piv 0.43 -0.0 0.40 -0.1 0.38 -0.0 0.41 ± 3% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 1.14 -0.0 1.10 -0.0 1.09 +0.0 1.15 perf-profile.children.cycles-pp.mt_find 0.62 -0.0 0.58 -0.0 0.58 -0.0 0.61 perf-profile.children.cycles-pp.__put_partials 0.46 ± 7% -0.0 0.42 ± 2% -0.0 0.43 -0.0 0.45 perf-profile.children.cycles-pp._raw_spin_lock 0.90 -0.0 0.87 -0.0 0.88 +0.0 0.90 perf-profile.children.cycles-pp.userfaultfd_unmap_complete 0.46 ± 3% -0.0 0.42 ± 3% -0.0 0.42 ± 2% -0.0 0.45 ± 2% perf-profile.children.cycles-pp.__alloc_pages_noprof 0.61 -0.0 0.58 -0.0 0.58 -0.0 0.60 perf-profile.children.cycles-pp.entry_SYSCALL_64 0.44 ± 3% -0.0 0.40 ± 3% -0.0 0.40 ± 2% -0.0 0.43 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist 0.48 -0.0 0.45 ± 2% -0.0 0.45 -0.0 0.46 perf-profile.children.cycles-pp.mas_prev_range 0.64 -0.0 0.61 -0.0 0.61 +0.0 0.65 perf-profile.children.cycles-pp.get_old_pud 0.31 ± 2% -0.0 0.28 ± 3% -0.0 0.29 ± 2% +0.0 0.32 ± 3% perf-profile.children.cycles-pp.security_vm_enough_memory_mm 0.33 ± 3% -0.0 0.30 ± 2% -0.0 
0.30 ± 2% -0.0 0.32 ± 2% perf-profile.children.cycles-pp.mas_put_in_tree 0.32 ± 2% -0.0 0.29 ± 2% -0.0 0.30 ± 3% -0.0 0.31 ± 2% perf-profile.children.cycles-pp.tlb_finish_mmu 0.47 -0.0 0.44 ± 2% -0.0 0.42 ± 2% -0.0 0.45 perf-profile.children.cycles-pp.rcu_segcblist_enqueue 0.70 ± 3% -0.0 0.68 -0.0 0.66 ± 3% -0.1 0.60 perf-profile.children.cycles-pp.__anon_vma_interval_tree_remove 0.32 ± 3% -0.0 0.30 ± 2% -0.0 0.30 -0.0 0.32 perf-profile.children.cycles-pp.free_unref_page 0.55 -0.0 0.53 -0.0 0.55 ± 2% +0.0 0.58 perf-profile.children.cycles-pp.refill_obj_stock 0.33 -0.0 0.31 -0.0 0.32 +0.0 0.33 perf-profile.children.cycles-pp.mas_destroy 0.25 ± 4% -0.0 0.23 ± 3% -0.0 0.23 ± 3% -0.0 0.25 ± 2% perf-profile.children.cycles-pp.rmqueue 0.35 -0.0 0.34 -0.0 0.34 +0.0 0.36 perf-profile.children.cycles-pp.__rb_insert_augmented 0.39 -0.0 0.37 -0.0 0.36 ± 2% -0.0 0.38 perf-profile.children.cycles-pp.down_write_killable 0.22 ± 4% -0.0 0.20 ± 3% -0.0 0.20 ± 3% -0.0 0.22 ± 3% perf-profile.children.cycles-pp.__rmqueue_pcplist 0.21 ± 4% -0.0 0.19 ± 3% -0.0 0.19 ± 3% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.rmqueue_bulk 0.52 -0.0 0.51 ± 2% +0.1 0.59 +0.1 0.64 perf-profile.children.cycles-pp.__pte_offset_map 0.30 ± 2% -0.0 0.28 ± 2% -0.1 0.23 ± 3% -0.0 0.25 ± 3% perf-profile.children.cycles-pp.__vm_enough_memory 0.26 -0.0 0.24 ± 2% -0.0 0.21 -0.0 0.22 perf-profile.children.cycles-pp.syscall_exit_to_user_mode 0.28 ± 2% -0.0 0.27 ± 2% -0.0 0.26 -0.0 0.28 ± 2% perf-profile.children.cycles-pp.free_unref_page_commit 0.29 -0.0 0.27 -0.0 0.27 ± 2% +0.0 0.29 ± 2% perf-profile.children.cycles-pp.tlb_gather_mmu 0.16 ± 2% -0.0 0.14 ± 3% -0.0 0.14 ± 2% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.mas_wr_append 0.28 ± 2% -0.0 0.26 +0.0 0.32 +0.1 0.33 ± 2% perf-profile.children.cycles-pp.khugepaged_enter_vma 0.32 -0.0 0.30 -0.0 0.30 -0.0 0.32 ± 2% perf-profile.children.cycles-pp.mas_wr_store_setup 0.09 ± 4% -0.0 0.08 ± 5% -0.0 0.06 ± 6% -0.0 0.07 perf-profile.children.cycles-pp.vma_dup_policy 0.43 -0.0 0.42 -0.0 0.41 +0.0 0.43 perf-profile.children.cycles-pp.mremap_userfaultfd_complete 0.13 ± 6% -0.0 0.12 ± 11% -0.0 0.10 ± 4% +0.0 0.13 ± 9% perf-profile.children.cycles-pp.vm_stat_account 0.36 -0.0 0.35 -0.0 0.35 +0.0 0.37 perf-profile.children.cycles-pp.madvise_vma_behavior 0.18 ± 2% -0.0 0.17 ± 2% -0.0 0.16 ± 2% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.__free_one_page 0.16 ± 3% -0.0 0.15 ± 3% -0.0 0.12 -0.0 0.13 ± 3% perf-profile.children.cycles-pp.x64_sys_call 0.15 ± 3% -0.0 0.14 ± 3% -0.0 0.13 ± 2% -0.0 0.14 ± 2% perf-profile.children.cycles-pp.flush_tlb_batched_pending 0.15 ± 2% -0.0 0.14 ± 3% +0.0 0.19 ± 2% +0.1 0.20 ± 2% perf-profile.children.cycles-pp.mas_node_count_gfp 0.24 ± 2% +0.0 0.24 ± 3% +0.0 0.24 ± 2% +0.0 0.27 ± 6% perf-profile.children.cycles-pp.lru_add_drain 0.07 +0.0 0.07 ± 6% -0.0 0.05 -0.0 0.05 ± 9% perf-profile.children.cycles-pp.__x64_sys_mremap 0.14 ± 3% +0.0 0.15 ± 2% +0.0 0.14 ± 5% +0.0 0.14 ± 2% perf-profile.children.cycles-pp.free_pgd_range 0.08 ± 4% +0.0 0.10 ± 4% +0.0 0.08 +0.0 0.08 perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags 0.78 +0.1 0.85 +0.1 0.84 +0.0 0.79 perf-profile.children.cycles-pp.__madvise 0.63 +0.1 0.71 +0.1 0.70 +0.0 0.64 perf-profile.children.cycles-pp.__x64_sys_madvise 0.63 +0.1 0.70 +0.1 0.70 +0.0 0.64 perf-profile.children.cycles-pp.do_madvise 3.52 +0.1 3.60 +0.4 3.97 +0.5 4.03 perf-profile.children.cycles-pp.free_pgtables 0.00 +0.1 0.09 +0.1 0.09 ± 3% +0.0 0.00 perf-profile.children.cycles-pp.can_modify_mm_madv 1.30 +0.2 1.46 +0.2 1.48 +0.0 
1.32 perf-profile.children.cycles-pp.mas_next_slot 88.06 +0.8 88.84 +0.9 88.91 +0.3 88.40 perf-profile.children.cycles-pp.mremap 83.81 +1.0 84.84 +1.2 84.99 +0.5 84.28 perf-profile.children.cycles-pp.__do_sys_mremap 85.98 +1.0 87.02 +1.1 87.07 +0.4 86.38 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe 85.50 +1.1 86.56 +1.1 86.60 +0.4 85.89 perf-profile.children.cycles-pp.do_syscall_64 2.12 +1.5 3.62 +1.5 3.61 +0.0 2.13 perf-profile.children.cycles-pp.do_munmap 40.41 +1.5 41.93 +1.6 42.04 +0.3 40.75 perf-profile.children.cycles-pp.do_vmi_munmap 3.62 +2.4 5.98 +2.3 5.93 +0.0 3.65 perf-profile.children.cycles-pp.mas_walk 5.40 +3.0 8.44 +3.0 8.41 +0.1 5.47 perf-profile.children.cycles-pp.mremap_to 5.26 +3.2 8.48 +3.2 8.44 +0.1 5.31 perf-profile.children.cycles-pp.mas_find 0.00 +5.5 5.46 +5.4 5.42 +0.0 0.00 perf-profile.children.cycles-pp.can_modify_mm 11.49 -0.6 10.92 -0.6 10.92 -0.1 11.41 perf-profile.self.cycles-pp.__slab_free 4.32 -0.2 4.07 -1.1 3.26 ± 2% -0.9 3.46 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook 1.96 -0.2 1.80 ± 4% -0.2 1.75 ± 6% -0.1 1.89 ± 3% perf-profile.self.cycles-pp.__memcpy 2.36 ± 2% -0.1 2.24 ± 2% -0.1 2.22 ± 3% +0.0 2.38 ± 2% perf-profile.self.cycles-pp.down_write 2.42 -0.1 2.30 -0.1 2.31 +0.0 2.44 perf-profile.self.cycles-pp.rcu_cblist_dequeue 2.33 -0.1 2.22 -0.1 2.21 -0.0 2.32 perf-profile.self.cycles-pp.mtree_load 2.21 -0.1 2.10 -0.1 2.10 +0.0 2.22 perf-profile.self.cycles-pp.native_flush_tlb_one_user 2.04 ± 5% -0.1 1.95 ± 3% -0.2 1.80 ± 3% -0.1 1.90 ± 3% perf-profile.self.cycles-pp.obj_cgroup_charge 1.62 -0.1 1.54 -0.1 1.55 +0.0 1.63 ± 2% perf-profile.self.cycles-pp.__memcg_slab_free_hook 1.52 -0.1 1.44 -0.1 1.45 -0.0 1.50 perf-profile.self.cycles-pp.mas_wr_walk 1.15 ± 2% -0.1 1.07 -0.1 1.08 -0.0 1.14 ± 2% perf-profile.self.cycles-pp.shuffle_freelist 1.53 -0.1 1.45 -0.1 1.46 +0.0 1.53 perf-profile.self.cycles-pp.up_write 1.44 -0.1 1.36 -0.1 1.33 -0.0 1.41 perf-profile.self.cycles-pp.__call_rcu_common 0.70 ± 2% -0.1 0.62 -0.1 0.64 ± 3% -0.0 0.67 ± 2% perf-profile.self.cycles-pp.rcu_all_qs 1.72 -0.1 1.66 +1.0 2.68 ± 2% +1.1 2.84 perf-profile.self.cycles-pp.mod_objcg_state 0.51 ± 3% -0.1 0.45 -0.0 0.47 -0.0 0.50 perf-profile.self.cycles-pp.security_mmap_addr 2.52 -0.1 2.46 -0.2 2.36 -0.2 2.33 perf-profile.self.cycles-pp.kmem_cache_alloc_noprof 0.94 ± 2% -0.1 0.88 ± 4% -0.1 0.88 ± 3% -0.0 0.92 ± 5% perf-profile.self.cycles-pp.vm_area_dup 1.18 -0.1 1.12 -0.1 1.12 -0.0 1.18 perf-profile.self.cycles-pp.vma_merge 0.89 -0.1 0.83 -0.1 0.83 -0.0 0.88 perf-profile.self.cycles-pp.___slab_alloc 1.38 -0.1 1.33 -0.0 1.34 +0.0 1.39 perf-profile.self.cycles-pp.do_vmi_align_munmap 0.62 -0.1 0.56 ± 2% -0.1 0.56 -0.0 0.59 perf-profile.self.cycles-pp.mremap 1.00 -0.1 0.95 -0.1 0.94 -0.0 0.97 perf-profile.self.cycles-pp.mas_preallocate 0.98 -0.1 0.93 -0.0 0.94 -0.0 0.98 perf-profile.self.cycles-pp.move_ptes 0.99 -0.1 0.94 -0.0 0.94 -0.0 0.99 perf-profile.self.cycles-pp.mas_prev_slot 1.09 -0.0 1.04 ± 2% -0.0 1.07 +0.0 1.14 perf-profile.self.cycles-pp.__cond_resched 0.94 -0.0 0.90 -0.1 0.88 -0.0 0.94 perf-profile.self.cycles-pp.vm_area_free_rcu_cb 0.85 -0.0 0.80 -0.0 0.84 +0.0 0.88 perf-profile.self.cycles-pp.mas_pop_node 0.77 -0.0 0.72 -0.1 0.64 -0.1 0.66 perf-profile.self.cycles-pp.percpu_counter_add_batch 0.68 -0.0 0.63 -0.1 0.62 -0.0 0.66 perf-profile.self.cycles-pp.__split_vma 1.17 -0.0 1.13 -0.1 1.11 +0.0 1.17 perf-profile.self.cycles-pp.clear_bhb_loop 0.95 -0.0 0.91 -0.0 0.91 +0.0 0.95 perf-profile.self.cycles-pp.mas_leaf_max_gap 0.79 -0.0 0.75 -0.0 0.77 
+0.0 0.80 perf-profile.self.cycles-pp.mas_wr_store_entry 0.44 -0.0 0.40 -0.0 0.41 +0.0 0.44 perf-profile.self.cycles-pp.do_munmap 1.22 -0.0 1.18 -0.0 1.19 +0.0 1.22 perf-profile.self.cycles-pp.move_vma 0.45 -0.0 0.42 -0.0 0.41 -0.0 0.43 perf-profile.self.cycles-pp.mas_wr_end_piv 0.89 -0.0 0.86 -0.0 0.87 +0.0 0.90 perf-profile.self.cycles-pp.mas_store_gfp 0.43 ± 2% -0.0 0.40 -0.1 0.38 -0.0 0.41 ± 3% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 0.78 -0.0 0.75 -0.0 0.76 +0.0 0.79 perf-profile.self.cycles-pp.userfaultfd_unmap_complete 0.66 -0.0 0.63 -0.0 0.63 -0.0 0.66 perf-profile.self.cycles-pp.mas_store_prealloc 1.49 -0.0 1.46 -0.0 1.45 ± 2% +0.0 1.50 perf-profile.self.cycles-pp.kmem_cache_free 0.60 -0.0 0.58 -0.0 0.58 +0.0 0.61 perf-profile.self.cycles-pp.unmap_region 0.86 -0.0 0.83 -0.0 0.84 +0.0 0.88 perf-profile.self.cycles-pp.move_page_tables 0.43 ± 4% -0.0 0.40 -0.0 0.40 -0.0 0.42 perf-profile.self.cycles-pp.anon_vma_interval_tree_insert 0.99 -0.0 0.97 -0.0 0.95 +0.0 1.00 perf-profile.self.cycles-pp.mt_find 0.71 -0.0 0.68 -0.0 0.67 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range 0.36 ± 3% -0.0 0.33 ± 2% -0.0 0.34 ± 3% +0.0 0.36 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret 0.55 -0.0 0.52 -0.0 0.52 +0.0 0.55 perf-profile.self.cycles-pp.get_old_pud 0.49 -0.0 0.47 -0.0 0.47 +0.0 0.49 perf-profile.self.cycles-pp.find_vma_prev 0.27 -0.0 0.25 -0.0 0.25 -0.0 0.26 ± 2% perf-profile.self.cycles-pp.mas_prev_setup 0.41 -0.0 0.39 -0.0 0.39 +0.0 0.42 perf-profile.self.cycles-pp._raw_spin_lock_irqsave 0.61 -0.0 0.58 -0.0 0.59 +0.0 0.62 perf-profile.self.cycles-pp.copy_vma 0.37 ± 6% -0.0 0.35 ± 2% -0.0 0.36 -0.0 0.37 perf-profile.self.cycles-pp._raw_spin_lock 0.47 -0.0 0.45 ± 2% -0.0 0.46 -0.0 0.47 perf-profile.self.cycles-pp.flush_tlb_mm_range 0.42 ± 2% -0.0 0.40 ± 2% -0.0 0.38 ± 2% -0.0 0.41 perf-profile.self.cycles-pp.rcu_segcblist_enqueue 0.27 -0.0 0.25 ± 2% -0.0 0.24 ± 2% -0.0 0.26 ± 2% perf-profile.self.cycles-pp.mas_put_in_tree 0.44 -0.0 0.42 -0.0 0.42 +0.0 0.44 perf-profile.self.cycles-pp.mas_update_gap 0.39 -0.0 0.37 -0.0 0.38 -0.0 0.39 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack 0.49 -0.0 0.47 +0.0 0.50 ± 2% +0.0 0.52 perf-profile.self.cycles-pp.refill_obj_stock 0.27 ± 2% -0.0 0.25 ± 2% -0.0 0.26 -0.0 0.27 perf-profile.self.cycles-pp.tlb_finish_mmu 0.34 -0.0 0.32 -0.0 0.32 -0.0 0.33 perf-profile.self.cycles-pp.zap_pmd_range 0.48 -0.0 0.46 -0.0 0.48 +0.0 0.49 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 0.58 ± 2% -0.0 0.56 -0.0 0.54 ± 3% -0.1 0.48 perf-profile.self.cycles-pp.__anon_vma_interval_tree_remove 0.28 -0.0 0.26 -0.0 0.27 +0.0 0.28 ± 2% perf-profile.self.cycles-pp.mas_alloc_nodes 0.24 ± 2% -0.0 0.22 -0.0 0.22 +0.0 0.24 ± 2% perf-profile.self.cycles-pp.mas_prev 0.14 ± 3% -0.0 0.12 ± 2% -0.0 0.12 -0.0 0.12 perf-profile.self.cycles-pp.syscall_exit_to_user_mode 0.52 -0.0 0.51 -0.0 0.51 +0.0 0.55 perf-profile.self.cycles-pp.mremap_to 0.26 -0.0 0.24 -0.0 0.24 -0.0 0.26 perf-profile.self.cycles-pp.__rb_insert_augmented 0.40 -0.0 0.39 -0.0 0.39 +0.0 0.41 ± 2% perf-profile.self.cycles-pp.__pte_offset_map_lock 0.38 -0.0 0.37 -0.0 0.36 -0.0 0.38 perf-profile.self.cycles-pp.mremap_userfaultfd_complete 0.28 -0.0 0.26 ± 3% -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.mas_prev_range 0.33 ± 2% -0.0 0.32 -0.0 0.31 -0.0 0.33 ± 2% perf-profile.self.cycles-pp.zap_pte_range 0.28 -0.0 0.26 -0.0 0.27 +0.0 0.28 perf-profile.self.cycles-pp.flush_tlb_func 0.22 -0.0 0.21 ± 2% -0.0 0.20 ± 2% -0.0 0.21 
perf-profile.self.cycles-pp.entry_SYSCALL_64 0.10 -0.0 0.09 -0.0 0.09 ± 3% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.mod_node_page_state 0.17 -0.0 0.16 -0.0 0.17 ± 2% +0.0 0.17 perf-profile.self.cycles-pp.__thp_vma_allowable_orders 0.44 -0.0 0.42 ± 2% +0.1 0.50 +0.1 0.54 perf-profile.self.cycles-pp.__pte_offset_map 0.06 -0.0 0.05 -0.1 0.00 -0.0 0.02 ±129% perf-profile.self.cycles-pp.vma_dup_policy 0.13 ± 3% -0.0 0.12 ± 3% -0.0 0.09 -0.0 0.09 ± 5% perf-profile.self.cycles-pp.x64_sys_call 0.31 -0.0 0.30 -0.0 0.29 -0.0 0.29 perf-profile.self.cycles-pp.unmap_vmas 0.10 ± 10% -0.0 0.09 ± 12% -0.0 0.08 ± 5% +0.0 0.10 ± 12% perf-profile.self.cycles-pp.vm_stat_account 0.08 ± 5% -0.0 0.07 ± 4% +0.0 0.11 ± 3% +0.0 0.12 ± 3% perf-profile.self.cycles-pp.mas_node_count_gfp 0.22 -0.0 0.21 ± 2% -0.0 0.20 -0.0 0.21 ± 2% perf-profile.self.cycles-pp.do_syscall_64 0.11 -0.0 0.10 ± 4% -0.0 0.10 +0.0 0.11 perf-profile.self.cycles-pp.security_vm_enough_memory_mm 0.08 -0.0 0.08 ± 5% -0.0 0.08 ± 4% +0.0 0.09 perf-profile.self.cycles-pp.__vm_enough_memory 0.07 +0.0 0.07 +0.0 0.08 +0.0 0.09 ± 3% perf-profile.self.cycles-pp.khugepaged_enter_vma 0.15 ± 3% +0.0 0.16 ± 3% +0.0 0.16 ± 3% +0.0 0.17 ± 2% perf-profile.self.cycles-pp.vma_to_resize 0.56 +0.0 0.57 -0.0 0.53 -0.0 0.53 perf-profile.self.cycles-pp.__do_sys_mremap 0.06 ± 5% +0.0 0.07 +0.0 0.06 +0.0 0.06 perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags 0.11 ± 4% +0.0 0.12 ± 4% -0.0 0.11 ± 3% +0.0 0.12 ± 3% perf-profile.self.cycles-pp.free_pgd_range 0.21 +0.0 0.22 ± 2% -0.0 0.21 ± 2% +0.0 0.22 ± 2% perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags 0.45 +0.0 0.48 +0.0 0.48 -0.0 0.44 perf-profile.self.cycles-pp.do_vmi_munmap 0.27 +0.0 0.32 +0.3 0.60 +0.4 0.62 perf-profile.self.cycles-pp.free_pgtables 0.36 ± 2% +0.1 0.44 +0.0 0.37 ± 2% -0.0 0.35 perf-profile.self.cycles-pp.unlink_anon_vmas 1.06 +0.1 1.19 +0.1 1.20 +0.0 1.08 perf-profile.self.cycles-pp.mas_next_slot 1.49 +0.5 2.01 +0.5 1.98 +0.0 1.50 perf-profile.self.cycles-pp.mas_find 0.00 +1.4 1.38 +1.4 1.38 +0.0 0.00 perf-profile.self.cycles-pp.can_modify_mm 3.15 +2.1 5.23 +2.0 5.19 +0.0 3.16 perf-profile.self.cycles-pp.mas_walk > > For everyone: Apologies if you're in the CC list and I didn't CC you, > but I tried to keep my patch set's CC list relatively short and clean > (and I focused on the active participants). > Everyone's comments are very welcome. > > [1]: https://lore.kernel.org/all/20240806212808.1885309-1-pedro.falcato@gmail.com/ > -- > Pedro ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-05 18:55 ` Linus Torvalds 2024-08-05 19:33 ` Linus Torvalds @ 2024-08-05 19:37 ` Jeff Xu 2024-08-05 19:48 ` Linus Torvalds 1 sibling, 1 reply; 29+ messages in thread From: Jeff Xu @ 2024-08-05 19:37 UTC (permalink / raw) To: Linus Torvalds Cc: Michael Ellerman, Nicholas Piggin, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Mon, Aug 5, 2024 at 12:01 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Mon, 5 Aug 2024 at 11:11, Jeff Xu <jeffxu@google.com> wrote: > > > > One thing that you can't walk around is that can_modify_mm must be > > called prior to arch_unmap, that means in-place check for the munmap > > is not possible. > > Actually, we should move 'arch_unmap()'. > I think you meant "remove" > There is only one user of it, and it's pretty pointless. > > (Ok, there are two users - x86 also has an 'arch_unmap()', but it's empty). > > The reason I say that the current user of arch_unmap() is pointless is > because this is what the powerpc user does: > > static inline void arch_unmap(struct mm_struct *mm, > unsigned long start, unsigned long end) > { > unsigned long vdso_base = (unsigned long)mm->context.vdso; > > if (start <= vdso_base && vdso_base < end) > mm->context.vdso = NULL; > } > > and that would make sense if we didn't have an actual 'vma' that > matched the vdso. But we do. > > I think this code may predate the whole "create a vma for the vdso" > code. Or maybe it was just always confused. > Agree it is best to remove. > Anyway, what the code *should* do is that we should just have a > ->close() function for special mappings, and call that in > special_mapping_close(). > I'm curious, why does ppc need to unmap vdso ? ( other archs don't have unmap logic.) vdso has .remap, iiuc, that is for CHECKPOINT_RESTORE feature, i.e. during restore, vdso might get relocated after taking from dump. [1] IIUC, vdso mapping doesn't change during the lifetime of the process. Or does it in some user cases ? [1] https://lore.kernel.org/linux-mm/20161101172214.2938-1-dsafonov@virtuozzo.com/ > This is an ENTIRELY UNTESTED patch that gets rid of this horrendous wart. > > Michael / Nick / Christophe? Note that I didn't even compile-test this > on x86-64, much less on powerpc. > > So please consider this a "maybe something like this" patch, but that > 'arch_unmap()' really is pretty nasty. > > Oh, and there was a bug in the error path of the powerpc vdso setup > code anyway. The patch fixes that too, although considering the > entirely untested nature of it, the "fixes" is laughably optimistic. > > Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
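To make the proposal in the message above concrete: instead of powerpc clearing context.vdso from an arch_unmap() hook on every munmap(), the vdso's special mapping could carry its own teardown callback, invoked from special_mapping_close() when the vma is actually removed. The fragment below is only an illustrative sketch of that idea: the .close member and its signature are the hypothetical extension being discussed, not something that existed in mainline at this point in the thread, and the powerpc structure and field names are simplified.

	/* Hypothetical: assumes a new ->close() hook on struct vm_special_mapping. */
	static void vdso_close(const struct vm_special_mapping *sm,
			       struct vm_area_struct *vma)
	{
		struct mm_struct *mm = vma->vm_mm;

		/*
		 * The vdso vma is being torn down: drop the cached pointer so
		 * nothing (e.g. signal delivery) keeps using a stale address.
		 */
		mm->context.vdso = NULL;
	}

	static struct vm_special_mapping vdso_spec = {
		.name  = "[vdso]",
		.close = vdso_close,	/* the hypothetical new hook */
	};

With something along these lines, the arch_unmap() call in do_vmi_munmap() could go away entirely, which also removes the ordering constraint Jeff mentions above (can_modify_mm() having to run before arch_unmap()).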
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-05 19:37 ` Jeff Xu @ 2024-08-05 19:48 ` Linus Torvalds 2024-08-05 19:50 ` Linus Torvalds 2024-08-05 23:24 ` Nicholas Piggin 0 siblings, 2 replies; 29+ messages in thread From: Linus Torvalds @ 2024-08-05 19:48 UTC (permalink / raw) To: Jeff Xu Cc: Michael Ellerman, Nicholas Piggin, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Mon, 5 Aug 2024 at 12:38, Jeff Xu <jeffxu@google.com> wrote: > > I'm curious, why does ppc need to unmap vdso ? ( other archs don't > have unmap logic.) I have no idea. There are comments about 'perf' getting confused about mmap counts when 'context.vdso' isn't set up. But x86 has the same context.vdso logic, and does *not* set the pointer before installing the vma, for example. Also does not zero it out on munmap(), although it does have the mremap logic. For all I know it may all be entirely unnecessary, and could be removed entirely. Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-05 19:48 ` Linus Torvalds @ 2024-08-05 19:50 ` Linus Torvalds 2024-08-05 23:24 ` Nicholas Piggin 1 sibling, 0 replies; 29+ messages in thread From: Linus Torvalds @ 2024-08-05 19:50 UTC (permalink / raw) To: Jeff Xu Cc: Michael Ellerman, Nicholas Piggin, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Mon, 5 Aug 2024 at 12:48, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > But x86 has the same context.vdso logic, and does *not* set the > pointer before installing the vma, for example. Also does not zero it > out on munmap(), although it does have the mremap logic. Oh, and the empty stale arch_unmap() code on the x86 side has never been about the vdso thing, it was about some horrid MPX notification that no longer exists. In case people wonder like I did. Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-05 19:48 ` Linus Torvalds 2024-08-05 19:50 ` Linus Torvalds @ 2024-08-05 23:24 ` Nicholas Piggin 2024-08-06 0:13 ` Linus Torvalds 1 sibling, 1 reply; 29+ messages in thread From: Nicholas Piggin @ 2024-08-05 23:24 UTC (permalink / raw) To: Linus Torvalds, Jeff Xu Cc: Michael Ellerman, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Tue Aug 6, 2024 at 5:48 AM AEST, Linus Torvalds wrote: > On Mon, 5 Aug 2024 at 12:38, Jeff Xu <jeffxu@google.com> wrote: > > > > I'm curious, why does ppc need to unmap vdso ? ( other archs don't > > have unmap logic.) > > I have no idea. There are comments about 'perf' getting confused about > mmap counts when 'context.vdso' isn't set up. > > But x86 has the same context.vdso logic, and does *not* set the > pointer before installing the vma, for example. Also does not zero it > out on munmap(), although it does have the mremap logic. > > For all I know it may all be entirely unnecessary, and could be > removed entirely. I don't know much about the vdso code; it predated my involvement in ppc. Commit 83d3f0e90c6c8 says CRIU (checkpoint restore in userspace) is moving it around. Why CRIU wants to do that, I don't know. Can userspace on other archs not unmap their vdsos? Thanks, Nick ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-05 23:24 ` Nicholas Piggin @ 2024-08-06 0:13 ` Linus Torvalds 2024-08-06 1:22 ` Jeff Xu 2024-08-06 2:01 ` Michael Ellerman 0 siblings, 2 replies; 29+ messages in thread From: Linus Torvalds @ 2024-08-06 0:13 UTC (permalink / raw) To: Nicholas Piggin Cc: Jeff Xu, Michael Ellerman, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Mon, 5 Aug 2024 at 16:25, Nicholas Piggin <npiggin@gmail.com> wrote: > > Can userspace on other archs not unmap their vdsos? I think they can, and nobody cares. The "context.vdso" value stays at some stale value, and anybody who tries to use it will just fail. So what makes powerpc special is not "you can unmap the vdso", but "powerpc cares". I just don't quite know _why_ powerpc cares. Judging by the comments and a quick 'grep', the reason may be arch/powerpc/perf/callchain_32.c which seems to have some vdso knowledge. But x86 does something kind of like that at signal frame generation time, and doesn't care. I really think it's an issue of "if you screw with the vdso, you get to keep both broken pieces". Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 0:13 ` Linus Torvalds @ 2024-08-06 1:22 ` Jeff Xu 2024-08-06 2:01 ` Michael Ellerman 1 sibling, 0 replies; 29+ messages in thread From: Jeff Xu @ 2024-08-06 1:22 UTC (permalink / raw) To: Linus Torvalds Cc: Nicholas Piggin, Michael Ellerman, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Mon, Aug 5, 2024 at 5:13 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Mon, 5 Aug 2024 at 16:25, Nicholas Piggin <npiggin@gmail.com> wrote: > > > > Can userspace on other archs not unmap their vdsos? > > I think they can, and nobody cares. The "context.vdso" value stays at > some stale value, and anybody who tries to use it will just fail. > I want to seal the vdso :-), so I also care (not having it changeable from userspace) For the restore scenario, if vdso is sealed, I guess CRIU won't be able to relocate the vdso from userspace, I 'm interested in hearing vdso dev's input on this , e.g. is that possible to make CRIU compatible with memory sealing. > So what makes powerpc special is not "you can unmap the vdso", but > "powerpc cares". > > I just don't quite know _why_ powerpc cares. > > Judging by the comments and a quick 'grep', the reason may be > > arch/powerpc/perf/callchain_32.c > > which seems to have some vdso knowledge. > > But x86 does something kind of like that at signal frame generation > time, and doesn't care. > > I really think it's an issue of "if you screw with the vdso, you get > to keep both broken pieces". > > Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
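Since sealing keeps coming up in the abstract, here is a minimal userspace sketch of what mseal() does from the application's point of view. The raw syscall number (462 in the mseal series) and the exact errno on a blocked operation should be treated as assumptions for this sketch rather than a reference; use the libc wrapper if your headers provide one.

	#include <errno.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	#ifndef __NR_mseal
	#define __NR_mseal 462		/* assumed number from the mseal series */
	#endif

	int main(void)
	{
		size_t len = (size_t)sysconf(_SC_PAGESIZE);
		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return 1;

		/* Seal the range; the flags argument must currently be 0. */
		if (syscall(__NR_mseal, p, len, 0) != 0) {
			perror("mseal");
			return 1;
		}

		/* Layout changes on a sealed range are rejected (EPERM expected). */
		if (munmap(p, len) != 0)
			printf("munmap blocked: %s\n", strerror(errno));

		return 0;
	}

This is the userspace-visible contract that the in-kernel can_modify_mm()/can_modify_vma() checks discussed throughout this thread enforce.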
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 0:13 ` Linus Torvalds 2024-08-06 1:22 ` Jeff Xu @ 2024-08-06 2:01 ` Michael Ellerman 2024-08-06 2:15 ` Linus Torvalds 2024-09-13 5:47 ` Christophe Leroy 1 sibling, 2 replies; 29+ messages in thread From: Michael Ellerman @ 2024-08-06 2:01 UTC (permalink / raw) To: Linus Torvalds, Nicholas Piggin Cc: Jeff Xu, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin Linus Torvalds <torvalds@linux-foundation.org> writes: > On Mon, 5 Aug 2024 at 16:25, Nicholas Piggin <npiggin@gmail.com> wrote: >> >> Can userspace on other archs not unmap their vdsos? > > I think they can, and nobody cares. The "context.vdso" value stays at > some stale value, and anybody who tries to use it will just fail. > > So what makes powerpc special is not "you can unmap the vdso", but > "powerpc cares". > > I just don't quite know _why_ powerpc cares. AFAIK for CRIU the problem is signal delivery: arch/powerpc/kernel/signal_64.c: int handle_rt_signal64(struct ksignal *ksig, sigset_t *set, struct task_struct *tsk) { ... /* Set up to return from userspace. */ if (tsk->mm->context.vdso) { regs_set_return_ip(regs, VDSO64_SYMBOL(tsk->mm->context.vdso, sigtramp_rt64)); ie. if the VDSO is moved but mm->context.vdso is not updated, signal delivery will crash in userspace. x86-64 always uses SA_RESTORER, and arm64 & s390 can use SA_RESTORER, so I think CRIU uses that to avoid problems with signal delivery when the VDSO is moved. riscv doesn't support SA_RESTORER but I guess CRIU doesn't support riscv yet so it's not become a problem. There was a patch to support SA_RESTORER on powerpc, but I balked at merging it because I couldn't find anyone on the glibc side to say whether they wanted it or not. I guess I should have just merged it. There was an attempt to unify all the vdso stuff and handle the VDSO mremap case in generic code: https://lore.kernel.org/lkml/20210611180242.711399-17-dima@arista.com/ But I think that series got a bit big and complicated and Dmitry had to move on to other things. cheers ^ permalink raw reply [flat|nested] 29+ messages in thread
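Michael's point is that powerpc's sigreturn trampoline address comes from mm->context.vdso, whereas x86-64 forces userspace to supply its own restorer. Purely as a hedged illustration of the difference (SA_RESTORER is not wired up for powerpc in mainline at this point, so the flag, the sa_restorer field and their use below are assumptions modelled on the unmerged patch Michael mentions), the powerpc signal-setup path could look roughly like:

	/* Hypothetical handle_rt_signal64() fragment, not mainline powerpc code. */
	if (ksig->ka.sa.sa_flags & SA_RESTORER) {
		/* Userspace supplied its own trampoline: no dependency on context.vdso. */
		regs_set_return_ip(regs, (unsigned long)ksig->ka.sa.sa_restorer);
	} else if (tsk->mm->context.vdso) {
		/* Current behaviour: relies on context.vdso being kept accurate. */
		regs_set_return_ip(regs,
			VDSO64_SYMBOL(tsk->mm->context.vdso, sigtramp_rt64));
	}
	/* else: fall back to a trampoline placed on the user stack, as today. */

This is why CRIU can move the vdso on architectures that use SA_RESTORER without breaking signal delivery, but not (currently) on powerpc.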
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 2:01 ` Michael Ellerman @ 2024-08-06 2:15 ` Linus Torvalds 2024-09-13 5:47 ` Christophe Leroy 1 sibling, 0 replies; 29+ messages in thread From: Linus Torvalds @ 2024-08-06 2:15 UTC (permalink / raw) To: Michael Ellerman Cc: Nicholas Piggin, Jeff Xu, Christophe Leroy, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Mon, 5 Aug 2024 at 19:01, Michael Ellerman <mpe@ellerman.id.au> wrote: > > > > > I just don't quite know _why_ powerpc cares. > > AFAIK for CRIU the problem is signal delivery: Hmm. Well, the patch I sent out should keep it all working. In fact, to some degree it would make it much more straightforward for other architectures to do the same thing. Instead of a random "arch_munmap()" hack, it's a fairly reasonable _install_special_mapping() extension. That said, the *other* thing I don't really understand is the strange "we have to set the context.vdso value before calling install_special_mapping": /* * Put vDSO base into mm struct. We need to do this before calling * install_special_mapping or the perf counter mmap tracking code * will fail to recognise it as a vDSO. */ and that looks odd too. Anyway, I wish we could just get rid of all the horrible signal restore crap. We used to just put it in the stack, and that worked really well apart from the whole WX thing. I wonder if we should just go back to that, and turn the resulting "page fault due to non-executable stack" into a "sigreturn system call". And yes, SA_RESTORER is the right thing. It's basically just user space telling us where it is. And happily, on x86-64 we just forced the issue, and we do /* x86-64 should always use SA_RESTORER. */ if (!(ksig->ka.sa.sa_flags & SA_RESTORER)) return -EFAULT; so you literally cannot have signals without it. Linus ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 2:01 ` Michael Ellerman 2024-08-06 2:15 ` Linus Torvalds @ 2024-09-13 5:47 ` Christophe Leroy 1 sibling, 0 replies; 29+ messages in thread From: Christophe Leroy @ 2024-09-13 5:47 UTC (permalink / raw) To: Michael Ellerman, Linus Torvalds, Nicholas Piggin Cc: Jeff Xu, Pedro Falcato, kernel test robot, Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin Le 06/08/2024 à 04:01, Michael Ellerman a écrit : > Linus Torvalds <torvalds@linux-foundation.org> writes: >> On Mon, 5 Aug 2024 at 16:25, Nicholas Piggin <npiggin@gmail.com> wrote: >>> >>> Can userspace on other archs not unmap their vdsos? >> >> I think they can, and nobody cares. The "context.vdso" value stays at >> some stale value, and anybody who tries to use it will just fail. >> >> So what makes powerpc special is not "you can unmap the vdso", but >> "powerpc cares". >> >> I just don't quite know _why_ powerpc cares. > > AFAIK for CRIU the problem is signal delivery: > > arch/powerpc/kernel/signal_64.c: > > int handle_rt_signal64(struct ksignal *ksig, sigset_t *set, > struct task_struct *tsk) > { > ... > /* Set up to return from userspace. */ > if (tsk->mm->context.vdso) { > regs_set_return_ip(regs, VDSO64_SYMBOL(tsk->mm->context.vdso, sigtramp_rt64)); > > > ie. if the VDSO is moved but mm->context.vdso is not updated, signal > delivery will crash in userspace. > > x86-64 always uses SA_RESTORER, and arm64 & s390 can use SA_RESTORER, so > I think CRIU uses that to avoid problems with signal delivery when the > VDSO is moved. > > riscv doesn't support SA_RESTORER but I guess CRIU doesn't support riscv > yet so it's not become a problem. > > There was a patch to support SA_RESTORER on powerpc, but I balked at > merging it because I couldn't find anyone on the glibc side to say > whether they wanted it or not. I guess I should have just merged it. The patch is at https://patchwork.ozlabs.org/project/linuxppc-dev/patch/afe50d1db63a10fde9547ea08fe1fa68b0638aba.1624618157.git.christophe.leroy@csgroup.eu/ It still applies cleanly. Christophe > > There was an attempt to unify all the vdso stuff and handle the > VDSO mremap case in generic code: > > https://lore.kernel.org/lkml/20210611180242.711399-17-dima@arista.com/ > > But I think that series got a bit big and complicated and Dmitry had to > move on to other things. > > cheers ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-04 20:32 ` Linus Torvalds 2024-08-05 13:33 ` Pedro Falcato @ 2024-08-05 17:54 ` Jeff Xu 1 sibling, 0 replies; 29+ messages in thread From: Jeff Xu @ 2024-08-05 17:54 UTC (permalink / raw) To: Linus Torvalds Cc: kernel test robot, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet, Jorge Lucangeli Obes, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Sun, Aug 4, 2024 at 1:33 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Sun, 4 Aug 2024 at 01:59, kernel test robot <oliver.sang@intel.com> wrote: > > > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on > > commit 8be7258aad44 ("mseal: add mseal syscall") > > Ok, it's basically just the vma walk in can_modify_mm(): > > > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot > > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find > > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm > > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk > > and looks like it's two different pathways. We have __do_sys_mremap -> > mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the > destination mapping, but we also have mremap_to() calling > can_modify_mm() directly for the source mapping. > There are two scenarios in the mremap syscall: 1> mremap_to (relocate vma), 2> shrink/expand. Those two scenarios are handled by different code paths:
For case 1> mremap_to (relocate vma):
-> can_modify_mm, check src for sealing
-> if MREMAP_FIXED
->-> do_munmap (dst) // free dst
->->-> do_vmi_munmap (dst)
->->->-> can_modify_mm (dst) // check dst for sealing
-> if dst size is smaller (shrink case)
->-> do_munmap(dst, to remove extra size)
->->-> do_vmi_munmap
->->->-> can_modify_mm(dst) (potentially a duplicate of the check done for MREMAP_FIXED; in practice the memory should already be unmapped, so the cost is looking up a nonexistent range in the maple tree)
For case 2> shrink/expand:
-> can_modify_mm, check addr is sealed
-> if dst size is smaller (shrink case)
->-> do_vmi_munmap(remove_extra_size)
->->-> can_modify_mm(addr) (this is redundant because addr is already checked)
For case 2, we could potentially improve it by passing a flag into do_vmi_munmap() to indicate that sealing has already been checked by the caller (however, this idea has to be tested to show an actual gain). The reported regression is in mremap; I wonder why mprotect/munmap don't show a similar impact, since they use the same pattern (one extra out-of-place check of the memory range). For version 9, I tested munmap/mprotect/madvise for perf [1]. The test shows mseal adds 20-40 ns or 50-100 CPU cycles per call, which is much smaller (one tenth) than the change from 5.10 to 6.8. The test uses multiple VMAs of various types [2]. The next step for me is to run stress-ng.pagemove.page_remaps_per_sec to understand why mremap shows such a big regression. [1] https://lore.kernel.org/all/20240214151130.616240-1-jeffxu@chromium.org/ [2] https://github.com/peaktocreek/mmperf Best regards, -Jeff > And then do_vmi_munmap() will do it's *own* vma_find() after having > done arch_unmap(). > > And do_munmap() will obviously do its own vma lookup as part of > calling vma_to_resize().
>
> So it looks like a large portion of this regression is because the
> mseal addition just ends up walking the vma list way too much.
>
>             Linus

^ permalink raw reply [flat|nested] 29+ messages in thread
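To make the "pass a flag into do_vmi_munmap()" idea in Jeff's reply above easier to follow, here is a small self-contained toy model in plain C. The types and function names are simplified stand-ins chosen for illustration (they are not the kernel's real interfaces, and this is not a proposed patch); it only shows the shape of the change: the caller tells the unmap path whether the sealing walk has already been done for the range, so the walk is not repeated.

#include <stdbool.h>
#include <stdio.h>

struct vma { unsigned long start, end; bool sealed; };
struct mm  { struct vma *vmas; int nr; };

/* Stand-in for can_modify_mm(): walk every VMA overlapping the range. */
static bool can_modify_mm(struct mm *mm, unsigned long start, unsigned long end)
{
        for (int i = 0; i < mm->nr; i++) {
                struct vma *v = &mm->vmas[i];

                if (v->start < end && v->end > start && v->sealed)
                        return false;
        }
        return true;
}

/* Stand-in for do_vmi_munmap(): only re-walk when the caller has not. */
static int do_vmi_munmap(struct mm *mm, unsigned long start, unsigned long end,
                         bool check_seal)
{
        if (check_seal && !can_modify_mm(mm, start, end))
                return -1;              /* -EPERM in the kernel */
        printf("unmap [%#lx, %#lx)\n", start, end);
        return 0;
}

/* Stand-in for the mremap shrink path: seal-check the whole range once,
 * then unmap the tail without repeating the walk. */
static int mremap_shrink(struct mm *mm, unsigned long addr,
                         unsigned long old_len, unsigned long new_len)
{
        if (!can_modify_mm(mm, addr, addr + old_len))
                return -1;
        return do_vmi_munmap(mm, addr + new_len, addr + old_len,
                             /* check_seal = */ false);
}

int main(void)
{
        struct vma v[] = { { 0x1000, 0x9000, false } };
        struct mm mm = { v, 1 };

        return mremap_shrink(&mm, 0x1000, 0x8000, 0x4000) ? 1 : 0;
}

Whether the equivalent change is worth making in the real kernel is exactly the "has to be tested to show an actual gain" caveat above; the model only illustrates how a caller-side hint removes the second range walk.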
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-04 8:59 [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression kernel test robot 2024-08-04 20:32 ` Linus Torvalds @ 2024-08-05 13:56 ` Jeff Xu 2024-08-05 16:58 ` Jeff Xu 2 siblings, 0 replies; 29+ messages in thread From: Jeff Xu @ 2024-08-05 13:56 UTC (permalink / raw) To: kernel test robot Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet, Jorge Lucangeli Obes, Linus Torvalds, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Sun, Aug 4, 2024 at 1:59 AM kernel test robot <oliver.sang@intel.com> wrote: > > > > Hello, > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on: > Looking. I'm setting up the environment so I can repro. . > > commit: 8be7258aad44b5e25977a98db136f677fa6f4370 ("mseal: add mseal syscall") > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > testcase: stress-ng > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory > parameters: > > nr_threads: 100% > testtime: 60s > test: pagemove > cpufreq_governor: performance > > > In addition to that, the commit also has significant impact on the following tests: > > +------------------+---------------------------------------------------------------------------------------------+ > | testcase: change | stress-ng: stress-ng.pkey.ops_per_sec -3.6% regression | > | test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory | > | test parameters | cpufreq_governor=performance | > | | nr_threads=100% | > | | test=pkey | > | | testtime=60s | > +------------------+---------------------------------------------------------------------------------------------+ > > > If you fix the issue in a separate patch/commit (i.e. 
not just a new version of > the same patch/commit), kindly add following tags > | Reported-by: kernel test robot <oliver.sang@intel.com> > | Closes: https://lore.kernel.org/oe-lkp/202408041602.caa0372-oliver.sang@intel.com > > > Details are as below: > --------------------------------------------------------------------------------------------------> > > > The kernel config and materials to reproduce are available at: > https://download.01.org/0day-ci/archive/20240804/202408041602.caa0372-oliver.sang@intel.com > > ========================================================================================= > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s > > commit: > ff388fe5c4 ("mseal: wire up mseal syscall") > 8be7258aad ("mseal: add mseal syscall") > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13 > ---------------- --------------------------- > %stddev %change %stddev > \ | \ > 41625945 -4.3% 39842322 proc-vmstat.numa_hit > 41559175 -4.3% 39774160 proc-vmstat.numa_local > 77484314 -4.4% 74105555 proc-vmstat.pgalloc_normal > 77205752 -4.4% 73826672 proc-vmstat.pgfree > 18361466 -4.2% 17596652 stress-ng.pagemove.ops > 306014 -4.2% 293262 stress-ng.pagemove.ops_per_sec > 205312 -4.4% 196176 stress-ng.pagemove.page_remaps_per_sec > 4961 +1.0% 5013 stress-ng.time.percent_of_cpu_this_job_got > 2917 +1.2% 2952 stress-ng.time.system_time > 1.07 -6.6% 1.00 perf-stat.i.MPKI > 3.354e+10 +3.5% 3.473e+10 perf-stat.i.branch-instructions > 1.795e+08 -4.2% 1.719e+08 perf-stat.i.cache-misses > 2.376e+08 -4.1% 2.279e+08 perf-stat.i.cache-references > 1.13 -3.0% 1.10 perf-stat.i.cpi > 1077 +4.3% 1124 perf-stat.i.cycles-between-cache-misses > 1.717e+11 +2.7% 1.762e+11 perf-stat.i.instructions > 0.88 +3.1% 0.91 perf-stat.i.ipc > 1.05 -6.8% 0.97 perf-stat.overall.MPKI > 0.25 ą 2% -0.0 0.24 perf-stat.overall.branch-miss-rate% > 1.13 -3.0% 1.10 perf-stat.overall.cpi > 1084 +4.0% 1127 perf-stat.overall.cycles-between-cache-misses > 0.88 +3.1% 0.91 perf-stat.overall.ipc > 3.298e+10 +3.5% 3.415e+10 perf-stat.ps.branch-instructions > 1.764e+08 -4.3% 1.689e+08 perf-stat.ps.cache-misses > 2.336e+08 -4.1% 2.24e+08 perf-stat.ps.cache-references > 194.57 -2.4% 189.96 ą 2% perf-stat.ps.cpu-migrations > 1.688e+11 +2.7% 1.733e+11 perf-stat.ps.instructions > 1.036e+13 +3.0% 1.068e+13 perf-stat.total.instructions > 75.12 -1.9 73.22 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 36.84 -1.6 35.29 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 24.90 -1.2 23.72 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 19.89 -0.9 18.98 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 10.56 ą 2% -0.8 9.78 ą 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 10.57 ą 2% -0.8 9.80 ą 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 10.52 ą 2% -0.8 9.75 ą 2% 
perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm > 14.75 -0.7 14.07 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 1.50 -0.6 0.94 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 5.88 ą 2% -0.4 5.47 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd > 7.80 -0.3 7.47 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 4.55 ą 2% -0.3 4.24 ą 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs > 6.76 -0.3 6.45 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 6.15 -0.3 5.86 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 8.22 -0.3 7.93 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 6.12 -0.3 5.87 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 5.74 -0.2 5.50 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > 3.16 ą 2% -0.2 2.94 perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd > 5.50 -0.2 5.28 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 1.36 -0.2 1.14 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > 5.15 -0.2 4.94 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap > 5.51 -0.2 5.31 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap > 5.16 -0.2 4.97 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma > 2.24 -0.2 2.05 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 2.60 ą 2% -0.2 2.42 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs > 4.67 -0.2 4.49 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma > 3.41 -0.2 3.23 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma > 3.00 -0.2 2.83 ą 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 0.96 -0.2 0.80 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to > 4.04 -0.2 3.88 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 3.20 ą 2% -0.2 3.04 ą 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > 3.53 -0.1 3.38 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap > 3.40 -0.1 3.26 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap > 2.20 ą 
2% -0.1 2.06 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 1.84 ą 3% -0.1 1.71 ą 3% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap > 1.78 ą 2% -0.1 1.65 ą 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap > 2.69 -0.1 2.56 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap > 1.78 ą 2% -0.1 1.66 ą 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core > 1.36 ą 2% -0.1 1.23 ą 2% perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd > 0.95 -0.1 0.83 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap > 3.29 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 2.08 -0.1 1.96 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap > 1.43 ą 3% -0.1 1.32 ą 3% perf-profile.calltrace.cycles-pp.down_write.vma_prepare.vma_merge.copy_vma.move_vma > 2.21 -0.1 2.10 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 2.47 -0.1 2.36 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma > 2.21 -0.1 2.12 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables > 1.41 -0.1 1.32 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 1.26 -0.1 1.18 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap > 1.82 -0.1 1.75 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 0.71 -0.1 0.63 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap > 1.29 -0.1 1.22 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma > 0.61 -0.1 0.54 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma > 1.36 -0.1 1.29 perf-profile.calltrace.cycles-pp.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap > 1.40 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma > 0.70 -0.1 0.64 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > 1.23 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma > 1.66 -0.1 1.60 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 1.16 -0.1 1.10 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 0.96 -0.1 0.90 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region > 1.14 -0.1 1.08 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > 0.79 -0.1 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma > 1.04 -0.1 1.00 
perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.58 -0.0 0.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap > 0.61 -0.0 0.56 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core > 0.56 -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > 0.57 -0.0 0.53 ą 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs > 0.78 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge > 0.88 -0.0 0.84 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 > 0.70 -0.0 0.66 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap > 0.97 -0.0 0.93 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 1.11 -0.0 1.08 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap > 0.75 -0.0 0.72 perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma > 0.74 -0.0 0.71 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap > 0.60 ą 2% -0.0 0.57 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 > 0.67 ą 2% -0.0 0.64 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma > 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.63 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.99 -0.0 0.96 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap > 0.62 ą 2% -0.0 0.59 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > 0.87 -0.0 0.84 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.78 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap > 0.64 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap > 0.90 -0.0 0.87 perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap > 0.54 -0.0 0.52 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.move_ptes.move_page_tables.move_vma.__do_sys_mremap > 1.04 +0.0 1.08 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region > 0.76 +0.1 0.83 perf-profile.calltrace.cycles-pp.__madvise > 0.63 +0.1 0.70 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 0.62 +0.1 0.70 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise > 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 
87.74 +0.7 88.45 perf-profile.calltrace.cycles-pp.mremap > 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap > 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap > 84.88 +0.9 85.77 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap > 84.73 +0.9 85.62 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.00 +0.9 0.92 ą 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma > 83.84 +0.9 84.78 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.00 +1.1 1.06 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 > 0.00 +1.2 1.21 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to > 2.07 +1.5 3.55 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 1.58 +1.5 3.07 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 > 0.00 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap > 0.00 +1.6 1.57 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.00 +1.7 1.72 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > 0.00 +2.0 2.01 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 5.39 +2.9 8.32 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 75.29 -1.9 73.37 perf-profile.children.cycles-pp.move_vma > 37.06 -1.6 35.50 perf-profile.children.cycles-pp.do_vmi_align_munmap > 24.98 -1.2 23.80 perf-profile.children.cycles-pp.copy_vma > 19.99 -1.0 19.02 perf-profile.children.cycles-pp.handle_softirqs > 19.97 -1.0 19.00 perf-profile.children.cycles-pp.rcu_core > 19.95 -1.0 18.98 perf-profile.children.cycles-pp.rcu_do_batch > 19.98 -0.9 19.06 perf-profile.children.cycles-pp.__split_vma > 17.55 -0.8 16.76 perf-profile.children.cycles-pp.kmem_cache_free > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.children.cycles-pp.run_ksoftirqd > 10.57 ą 2% -0.8 9.80 ą 2% perf-profile.children.cycles-pp.smpboot_thread_fn > 15.38 -0.8 14.62 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.children.cycles-pp.kthread > 10.62 ą 2% -0.8 9.86 ą 2% perf-profile.children.cycles-pp.ret_from_fork > 10.62 ą 2% -0.8 9.86 ą 2% perf-profile.children.cycles-pp.ret_from_fork_asm > 15.14 -0.7 14.44 perf-profile.children.cycles-pp.vma_merge > 12.08 -0.5 11.55 perf-profile.children.cycles-pp.__slab_free > 12.11 -0.5 11.62 perf-profile.children.cycles-pp.mas_wr_store_entry > 10.86 -0.5 10.39 perf-profile.children.cycles-pp.vm_area_dup > 11.89 -0.5 11.44 perf-profile.children.cycles-pp.mas_store_prealloc > 8.49 -0.4 8.06 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook > 9.88 -0.4 9.49 perf-profile.children.cycles-pp.mas_wr_node_store > 7.91 -0.3 7.58 perf-profile.children.cycles-pp.move_page_tables > 6.06 -0.3 5.78 perf-profile.children.cycles-pp.vm_area_free_rcu_cb > 8.28 -0.3 8.00 perf-profile.children.cycles-pp.unmap_region > 6.69 -0.3 6.42 perf-profile.children.cycles-pp.vma_complete > 5.06 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate > 5.82 -0.2 5.57 
perf-profile.children.cycles-pp.move_ptes > 4.24 -0.2 4.01 perf-profile.children.cycles-pp.anon_vma_clone > 3.50 -0.2 3.30 perf-profile.children.cycles-pp.down_write > 2.44 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev > 3.46 -0.2 3.28 perf-profile.children.cycles-pp.___slab_alloc > 3.45 -0.2 3.27 perf-profile.children.cycles-pp.free_pgtables > 2.54 -0.2 2.37 perf-profile.children.cycles-pp.rcu_cblist_dequeue > 3.35 -0.2 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook > 2.93 -0.2 2.78 perf-profile.children.cycles-pp.mas_alloc_nodes > 2.28 ą 2% -0.2 2.12 ą 2% perf-profile.children.cycles-pp.vma_prepare > 3.46 -0.1 3.32 perf-profile.children.cycles-pp.flush_tlb_mm_range > 3.41 -0.1 3.27 ą 2% perf-profile.children.cycles-pp.mod_objcg_state > 2.76 -0.1 2.63 perf-profile.children.cycles-pp.unlink_anon_vmas > 3.41 -0.1 3.28 perf-profile.children.cycles-pp.mas_store_gfp > 2.21 -0.1 2.09 perf-profile.children.cycles-pp.__cond_resched > 2.04 -0.1 1.94 perf-profile.children.cycles-pp.allocate_slab > 2.10 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common > 2.51 -0.1 2.40 perf-profile.children.cycles-pp.flush_tlb_func > 1.04 -0.1 0.94 perf-profile.children.cycles-pp.mas_prev > 2.71 -0.1 2.61 perf-profile.children.cycles-pp.mtree_load > 2.23 -0.1 2.14 perf-profile.children.cycles-pp.native_flush_tlb_one_user > 0.22 ą 5% -0.1 0.13 ą 13% perf-profile.children.cycles-pp.vm_stat_account > 0.95 -0.1 0.87 perf-profile.children.cycles-pp.mas_prev_setup > 1.65 -0.1 1.57 perf-profile.children.cycles-pp.mas_wr_walk > 1.84 -0.1 1.76 perf-profile.children.cycles-pp.up_write > 1.27 -0.1 1.20 perf-profile.children.cycles-pp.mas_prev_slot > 1.84 -0.1 1.77 perf-profile.children.cycles-pp.vma_link > 1.39 -0.1 1.32 perf-profile.children.cycles-pp.shuffle_freelist > 0.96 -0.1 0.90 ą 2% perf-profile.children.cycles-pp.rcu_all_qs > 0.86 -0.1 0.80 perf-profile.children.cycles-pp._raw_spin_lock_irqsave > 1.70 -0.1 1.64 perf-profile.children.cycles-pp.__get_unmapped_area > 0.34 ą 3% -0.1 0.29 ą 5% perf-profile.children.cycles-pp.security_vm_enough_memory_mm > 0.60 -0.0 0.55 perf-profile.children.cycles-pp.entry_SYSCALL_64 > 0.92 -0.0 0.87 perf-profile.children.cycles-pp.percpu_counter_add_batch > 1.07 -0.0 1.02 perf-profile.children.cycles-pp.vma_to_resize > 1.59 -0.0 1.54 perf-profile.children.cycles-pp.mas_update_gap > 0.44 ą 2% -0.0 0.40 ą 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath > 0.70 -0.0 0.66 perf-profile.children.cycles-pp.syscall_return_via_sysret > 1.13 -0.0 1.09 perf-profile.children.cycles-pp.mt_find > 0.20 ą 6% -0.0 0.17 ą 9% perf-profile.children.cycles-pp.cap_vm_enough_memory > 0.99 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node > 0.63 ą 2% -0.0 0.59 perf-profile.children.cycles-pp.security_mmap_addr > 0.62 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials > 1.17 -0.0 1.14 perf-profile.children.cycles-pp.clear_bhb_loop > 0.46 -0.0 0.43 ą 2% perf-profile.children.cycles-pp.__alloc_pages_noprof > 0.44 -0.0 0.41 ą 2% perf-profile.children.cycles-pp.get_page_from_freelist > 0.90 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete > 0.64 ą 2% -0.0 0.62 perf-profile.children.cycles-pp.get_old_pud > 1.07 -0.0 1.05 perf-profile.children.cycles-pp.mas_leaf_max_gap > 0.22 ą 3% -0.0 0.20 ą 2% perf-profile.children.cycles-pp.__rmqueue_pcplist > 0.55 -0.0 0.53 perf-profile.children.cycles-pp.refill_obj_stock > 0.25 -0.0 0.23 ą 3% perf-profile.children.cycles-pp.rmqueue > 0.48 -0.0 0.45 
perf-profile.children.cycles-pp.mremap_userfaultfd_prep > 0.33 -0.0 0.30 perf-profile.children.cycles-pp.free_unref_page > 0.46 -0.0 0.44 perf-profile.children.cycles-pp.setup_object > 0.21 ą 3% -0.0 0.19 ą 2% perf-profile.children.cycles-pp.rmqueue_bulk > 0.31 ą 3% -0.0 0.29 perf-profile.children.cycles-pp.__vm_enough_memory > 0.40 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack > 0.36 -0.0 0.35 perf-profile.children.cycles-pp.madvise_vma_behavior > 0.54 -0.0 0.53 ą 2% perf-profile.children.cycles-pp.mas_wr_end_piv > 0.46 -0.0 0.44 ą 2% perf-profile.children.cycles-pp.rcu_segcblist_enqueue > 0.34 -0.0 0.32 ą 2% perf-profile.children.cycles-pp.mas_destroy > 0.28 -0.0 0.26 ą 3% perf-profile.children.cycles-pp.mas_wr_store_setup > 0.30 -0.0 0.28 perf-profile.children.cycles-pp.pte_offset_map_nolock > 0.19 -0.0 0.18 ą 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders > 0.08 ą 4% -0.0 0.07 perf-profile.children.cycles-pp.ksm_madvise > 0.17 -0.0 0.16 perf-profile.children.cycles-pp.get_any_partial > 0.08 -0.0 0.07 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare > 0.45 +0.0 0.47 perf-profile.children.cycles-pp._raw_spin_lock > 1.10 +0.0 1.14 perf-profile.children.cycles-pp.zap_pte_range > 0.78 +0.1 0.85 perf-profile.children.cycles-pp.__madvise > 0.63 +0.1 0.70 perf-profile.children.cycles-pp.__x64_sys_madvise > 0.62 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise > 0.00 +0.1 0.09 ą 4% perf-profile.children.cycles-pp.can_modify_mm_madv > 1.32 +0.1 1.46 perf-profile.children.cycles-pp.mas_next_slot > 88.13 +0.7 88.83 perf-profile.children.cycles-pp.mremap > 83.94 +0.9 84.88 perf-profile.children.cycles-pp.__do_sys_mremap > 86.06 +0.9 87.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > 85.56 +1.0 86.54 perf-profile.children.cycles-pp.do_syscall_64 > 40.49 +1.4 41.90 perf-profile.children.cycles-pp.do_vmi_munmap > 2.10 +1.5 3.57 perf-profile.children.cycles-pp.do_munmap > 3.62 +2.3 5.90 perf-profile.children.cycles-pp.mas_walk > 5.44 +2.9 8.38 perf-profile.children.cycles-pp.mremap_to > 5.30 +3.1 8.39 perf-profile.children.cycles-pp.mas_find > 0.00 +5.4 5.40 perf-profile.children.cycles-pp.can_modify_mm > 11.46 -0.5 10.96 perf-profile.self.cycles-pp.__slab_free > 4.30 -0.2 4.08 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook > 2.51 -0.2 2.34 perf-profile.self.cycles-pp.rcu_cblist_dequeue > 2.41 ą 2% -0.2 2.25 perf-profile.self.cycles-pp.down_write > 2.21 -0.1 2.11 perf-profile.self.cycles-pp.native_flush_tlb_one_user > 2.37 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load > 1.60 -0.1 1.51 perf-profile.self.cycles-pp.__memcg_slab_free_hook > 0.18 ą 3% -0.1 0.10 ą 15% perf-profile.self.cycles-pp.vm_stat_account > 1.25 -0.1 1.18 perf-profile.self.cycles-pp.move_vma > 1.76 -0.1 1.69 perf-profile.self.cycles-pp.mod_objcg_state > 1.42 -0.1 1.35 ą 2% perf-profile.self.cycles-pp.__call_rcu_common > 1.41 -0.1 1.34 perf-profile.self.cycles-pp.mas_wr_walk > 1.52 -0.1 1.46 perf-profile.self.cycles-pp.up_write > 1.02 -0.1 0.95 perf-profile.self.cycles-pp.mas_prev_slot > 0.96 -0.1 0.90 ą 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb > 1.50 -0.1 1.45 perf-profile.self.cycles-pp.kmem_cache_free > 0.69 ą 3% -0.1 0.64 ą 2% perf-profile.self.cycles-pp.rcu_all_qs > 1.14 ą 2% -0.1 1.09 perf-profile.self.cycles-pp.shuffle_freelist > 1.10 -0.1 1.05 perf-profile.self.cycles-pp.__cond_resched > 1.40 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap > 0.99 -0.0 0.94 perf-profile.self.cycles-pp.mas_preallocate > 0.88 -0.0 0.83 
perf-profile.self.cycles-pp.___slab_alloc > 0.55 -0.0 0.50 perf-profile.self.cycles-pp.mremap_to > 0.98 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes > 0.78 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch > 0.21 ą 2% -0.0 0.18 ą 2% perf-profile.self.cycles-pp.entry_SYSCALL_64 > 0.44 ą 2% -0.0 0.40 ą 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath > 0.92 -0.0 0.89 perf-profile.self.cycles-pp.mas_store_gfp > 0.86 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node > 0.50 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > 1.15 -0.0 1.12 perf-profile.self.cycles-pp.clear_bhb_loop > 1.14 -0.0 1.11 perf-profile.self.cycles-pp.vma_merge > 0.66 -0.0 0.63 perf-profile.self.cycles-pp.__split_vma > 0.16 ą 6% -0.0 0.13 ą 7% perf-profile.self.cycles-pp.cap_vm_enough_memory > 0.82 -0.0 0.79 perf-profile.self.cycles-pp.mas_wr_store_entry > 0.54 ą 2% -0.0 0.52 perf-profile.self.cycles-pp.get_old_pud > 0.43 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap > 0.51 ą 2% -0.0 0.48 ą 2% perf-profile.self.cycles-pp.security_mmap_addr > 0.50 -0.0 0.48 perf-profile.self.cycles-pp.refill_obj_stock > 0.24 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev > 0.71 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range > 0.48 -0.0 0.45 perf-profile.self.cycles-pp.find_vma_prev > 0.42 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave > 0.66 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc > 0.31 -0.0 0.29 perf-profile.self.cycles-pp.mas_prev_setup > 0.43 -0.0 0.41 perf-profile.self.cycles-pp.mas_wr_end_piv > 0.78 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete > 0.28 -0.0 0.26 ą 2% perf-profile.self.cycles-pp.mas_put_in_tree > 0.42 -0.0 0.40 perf-profile.self.cycles-pp.mremap_userfaultfd_prep > 0.28 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables > 0.39 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > 0.30 ą 2% -0.0 0.28 perf-profile.self.cycles-pp.zap_pmd_range > 0.32 -0.0 0.31 perf-profile.self.cycles-pp.unmap_vmas > 0.21 -0.0 0.20 perf-profile.self.cycles-pp.__get_unmapped_area > 0.18 ą 2% -0.0 0.17 ą 2% perf-profile.self.cycles-pp.lru_add_drain_cpu > 0.06 -0.0 0.05 perf-profile.self.cycles-pp.ksm_madvise > 0.45 +0.0 0.46 perf-profile.self.cycles-pp.do_vmi_munmap > 0.37 +0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk > > > *************************************************************************************************** > lkp-spr-r02: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory > ========================================================================================= > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pkey/stress-ng/60s > > commit: > ff388fe5c4 ("mseal: wire up mseal syscall") > 8be7258aad ("mseal: add mseal syscall") > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13 > ---------------- --------------------------- > %stddev %change %stddev > \ | \ > 10539 -2.5% 10273 vmstat.system.cs > 0.28 ą 5% -20.1% 0.22 ą 7% sched_debug.cfs_rq:/.h_nr_running.stddev > 1419 ą 7% -15.3% 1202 ą 6% sched_debug.cfs_rq:/.util_avg.max > 0.28 ą 6% -18.4% 0.23 ą 8% sched_debug.cpu.nr_running.stddev > 8.736e+08 -3.6% 8.423e+08 
stress-ng.pkey.ops > 14560560 -3.6% 14038795 stress-ng.pkey.ops_per_sec > 770.39 ą 4% -5.0% 732.04 stress-ng.time.user_time > 244657 ą 3% +5.8% 258782 ą 3% proc-vmstat.nr_slab_unreclaimable > 73133541 -2.1% 71588873 proc-vmstat.numa_hit > 72873579 -2.1% 71357274 proc-vmstat.numa_local > 1.842e+08 -2.5% 1.796e+08 proc-vmstat.pgalloc_normal > 1.767e+08 -2.8% 1.717e+08 proc-vmstat.pgfree > 1345346 ą 40% -73.1% 362064 ą124% numa-vmstat.node0.nr_inactive_anon > 1345340 ą 40% -73.1% 362062 ą124% numa-vmstat.node0.nr_zone_inactive_anon > 2420830 ą 14% +35.1% 3270248 ą 16% numa-vmstat.node1.nr_file_pages > 2067871 ą 13% +51.5% 3132982 ą 17% numa-vmstat.node1.nr_inactive_anon > 191406 ą 17% +33.6% 255808 ą 14% numa-vmstat.node1.nr_mapped > 2452 ą 61% +104.4% 5012 ą 35% numa-vmstat.node1.nr_page_table_pages > 2067853 ą 13% +51.5% 3132966 ą 17% numa-vmstat.node1.nr_zone_inactive_anon > 5379238 ą 40% -73.0% 1453605 ą123% numa-meminfo.node0.Inactive > 5379166 ą 40% -73.0% 1453462 ą123% numa-meminfo.node0.Inactive(anon) > 8741077 ą 22% -36.7% 5531290 ą 28% numa-meminfo.node0.MemUsed > 9651902 ą 13% +35.8% 13105318 ą 16% numa-meminfo.node1.FilePages > 8239855 ą 13% +52.4% 12556929 ą 17% numa-meminfo.node1.Inactive > 8239712 ą 13% +52.4% 12556853 ą 17% numa-meminfo.node1.Inactive(anon) > 761944 ą 18% +34.6% 1025906 ą 14% numa-meminfo.node1.Mapped > 11679628 ą 11% +31.2% 15322841 ą 14% numa-meminfo.node1.MemUsed > 9874 ą 62% +104.6% 20200 ą 36% numa-meminfo.node1.PageTables > 0.74 -4.2% 0.71 perf-stat.i.MPKI > 1.245e+11 +2.3% 1.274e+11 perf-stat.i.branch-instructions > 0.37 -0.0 0.35 perf-stat.i.branch-miss-rate% > 4.359e+08 -2.1% 4.265e+08 perf-stat.i.branch-misses > 4.672e+08 -2.6% 4.548e+08 perf-stat.i.cache-misses > 7.276e+08 -2.7% 7.082e+08 perf-stat.i.cache-references > 1.00 -1.6% 0.98 perf-stat.i.cpi > 1364 +2.9% 1404 perf-stat.i.cycles-between-cache-misses > 6.392e+11 +1.7% 6.499e+11 perf-stat.i.instructions > 1.00 +1.6% 1.02 perf-stat.i.ipc > 0.74 -4.3% 0.71 perf-stat.overall.MPKI > 0.35 -0.0 0.33 perf-stat.overall.branch-miss-rate% > 1.00 -1.6% 0.99 perf-stat.overall.cpi > 1356 +2.9% 1395 perf-stat.overall.cycles-between-cache-misses > 1.00 +1.6% 1.01 perf-stat.overall.ipc > 1.209e+11 +1.9% 1.232e+11 perf-stat.ps.branch-instructions > 4.188e+08 -2.6% 4.077e+08 perf-stat.ps.branch-misses > 4.585e+08 -3.1% 4.441e+08 perf-stat.ps.cache-misses > 7.124e+08 -3.1% 6.901e+08 perf-stat.ps.cache-references > 10321 -2.6% 10053 perf-stat.ps.context-switches > > > > > > Disclaimer: > Results have been estimated based on internal Intel analysis and are provided > for informational purposes only. Any difference in system hardware or software > design or configuration may affect actual performance. > > > -- > 0-DAY CI Kernel Test Service > https://github.com/intel/lkp-tests/wiki > ^ permalink raw reply [flat|nested] 29+ messages in thread
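Independent of the full lkp-tests setup, the hot path that the profile points at can be exercised with a rough standalone approximation of a pagemove-style workload. The sketch below is illustrative only (it is not the stress-ng implementation, and the iteration counts are arbitrary): it repeatedly relocates single pages with mremap(MREMAP_MAYMOVE | MREMAP_FIXED), which takes the mremap_to() -> do_munmap() path where can_modify_mm() shows up in the call traces above.

/* Rough approximation of a pagemove-style mremap workload; illustrative only. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
        long page = sysconf(_SC_PAGESIZE);
        size_t npages = 256;
        size_t len = (npages + 1) * page;    /* one extra page as a scratch slot */
        char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        char *scratch;
        struct timespec t0, t1;
        long moves = 0;

        if (buf == MAP_FAILED) {
                perror("mmap");
                return 1;
        }
        scratch = buf + npages * page;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int iter = 0; iter < 1000; iter++) {
                for (size_t i = 0; i < npages; i++) {
                        /* Move page i into the scratch slot and back again;
                         * each call goes through mremap_to(). */
                        if (mremap(buf + i * page, page, page,
                                   MREMAP_MAYMOVE | MREMAP_FIXED, scratch) == MAP_FAILED ||
                            mremap(scratch, page, page,
                                   MREMAP_MAYMOVE | MREMAP_FIXED, buf + i * page) == MAP_FAILED) {
                                perror("mremap");
                                return 1;
                        }
                        moves += 2;
                }
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        printf("%.0f page remaps/sec\n",
               moves / ((t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9));
        munmap(buf, len);
        return 0;
}

Comparing the reported rate on the parent commit and on 8be7258aad should indicate whether this simplified single-mapping loop is enough to reproduce the slowdown, or whether the stress-ng stressor's more varied VMA pattern is needed.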
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-04 8:59 [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression kernel test robot 2024-08-04 20:32 ` Linus Torvalds 2024-08-05 13:56 ` Jeff Xu @ 2024-08-05 16:58 ` Jeff Xu 2024-08-06 1:44 ` Oliver Sang 2 siblings, 1 reply; 29+ messages in thread From: Jeff Xu @ 2024-08-05 16:58 UTC (permalink / raw) To: kernel test robot Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet, Jorge Lucangeli Obes, Linus Torvalds, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Sun, Aug 4, 2024 at 1:59 AM kernel test robot <oliver.sang@intel.com> wrote: > > > > Hello, > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on: > > > commit: 8be7258aad44b5e25977a98db136f677fa6f4370 ("mseal: add mseal syscall") > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > testcase: stress-ng > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory > parameters: > > nr_threads: 100% > testtime: 60s > test: pagemove > cpufreq_governor: performance > > > In addition to that, the commit also has significant impact on the following tests: > > +------------------+---------------------------------------------------------------------------------------------+ > | testcase: change | stress-ng: stress-ng.pkey.ops_per_sec -3.6% regression | > | test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory | > | test parameters | cpufreq_governor=performance | > | | nr_threads=100% | > | | test=pkey | > | | testtime=60s | > +------------------+---------------------------------------------------------------------------------------------+ > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of > the same patch/commit), kindly add following tags > | Reported-by: kernel test robot <oliver.sang@intel.com> > | Closes: https://lore.kernel.org/oe-lkp/202408041602.caa0372-oliver.sang@intel.com > > > Details are as below: > --------------------------------------------------------------------------------------------------> > > > The kernel config and materials to reproduce are available at: > https://download.01.org/0day-ci/archive/20240804/202408041602.caa0372-oliver.sang@intel.com > There is an error when I try to reproduce the test: bin/lkp install job.yaml -------------------------------------------------------- Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation: The following packages have unmet dependencies: libdw1 : Depends: libelf1 (= 0.190-1+b1) libdw1t64 : Breaks: libdw1 (< 0.191-2) E: Unable to correct problems, you have held broken packages. Cannot install some packages of perf-c2c depends ----------------------------------------------------------------------------------------- And where is stress-ng.pagemove.page_remaps_per_sec test implemented, is that part of lkp-tests ? 
Thanks -Jeff > ========================================================================================= > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s > > commit: > ff388fe5c4 ("mseal: wire up mseal syscall") > 8be7258aad ("mseal: add mseal syscall") > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13 > ---------------- --------------------------- > %stddev %change %stddev > \ | \ > 41625945 -4.3% 39842322 proc-vmstat.numa_hit > 41559175 -4.3% 39774160 proc-vmstat.numa_local > 77484314 -4.4% 74105555 proc-vmstat.pgalloc_normal > 77205752 -4.4% 73826672 proc-vmstat.pgfree > 18361466 -4.2% 17596652 stress-ng.pagemove.ops > 306014 -4.2% 293262 stress-ng.pagemove.ops_per_sec > 205312 -4.4% 196176 stress-ng.pagemove.page_remaps_per_sec > 4961 +1.0% 5013 stress-ng.time.percent_of_cpu_this_job_got > 2917 +1.2% 2952 stress-ng.time.system_time > 1.07 -6.6% 1.00 perf-stat.i.MPKI > 3.354e+10 +3.5% 3.473e+10 perf-stat.i.branch-instructions > 1.795e+08 -4.2% 1.719e+08 perf-stat.i.cache-misses > 2.376e+08 -4.1% 2.279e+08 perf-stat.i.cache-references > 1.13 -3.0% 1.10 perf-stat.i.cpi > 1077 +4.3% 1124 perf-stat.i.cycles-between-cache-misses > 1.717e+11 +2.7% 1.762e+11 perf-stat.i.instructions > 0.88 +3.1% 0.91 perf-stat.i.ipc > 1.05 -6.8% 0.97 perf-stat.overall.MPKI > 0.25 ą 2% -0.0 0.24 perf-stat.overall.branch-miss-rate% > 1.13 -3.0% 1.10 perf-stat.overall.cpi > 1084 +4.0% 1127 perf-stat.overall.cycles-between-cache-misses > 0.88 +3.1% 0.91 perf-stat.overall.ipc > 3.298e+10 +3.5% 3.415e+10 perf-stat.ps.branch-instructions > 1.764e+08 -4.3% 1.689e+08 perf-stat.ps.cache-misses > 2.336e+08 -4.1% 2.24e+08 perf-stat.ps.cache-references > 194.57 -2.4% 189.96 ą 2% perf-stat.ps.cpu-migrations > 1.688e+11 +2.7% 1.733e+11 perf-stat.ps.instructions > 1.036e+13 +3.0% 1.068e+13 perf-stat.total.instructions > 75.12 -1.9 73.22 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 36.84 -1.6 35.29 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 24.90 -1.2 23.72 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 19.89 -0.9 18.98 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 10.56 ą 2% -0.8 9.78 ą 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 10.57 ą 2% -0.8 9.80 ą 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 10.52 ą 2% -0.8 9.75 ą 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm > 14.75 -0.7 14.07 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 1.50 -0.6 0.94 
perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 5.88 ą 2% -0.4 5.47 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd > 7.80 -0.3 7.47 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 4.55 ą 2% -0.3 4.24 ą 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs > 6.76 -0.3 6.45 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 6.15 -0.3 5.86 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 8.22 -0.3 7.93 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 6.12 -0.3 5.87 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 5.74 -0.2 5.50 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > 3.16 ą 2% -0.2 2.94 perf-profile.calltrace.cycles-pp.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd > 5.50 -0.2 5.28 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 1.36 -0.2 1.14 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > 5.15 -0.2 4.94 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap > 5.51 -0.2 5.31 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap > 5.16 -0.2 4.97 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma > 2.24 -0.2 2.05 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 2.60 ą 2% -0.2 2.42 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core.handle_softirqs > 4.67 -0.2 4.49 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma > 3.41 -0.2 3.23 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma > 3.00 -0.2 2.83 ą 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 0.96 -0.2 0.80 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to > 4.04 -0.2 3.88 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 3.20 ą 2% -0.2 3.04 ą 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > 3.53 -0.1 3.38 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap > 3.40 -0.1 3.26 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap > 2.20 ą 2% -0.1 2.06 ą 2% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 1.84 ą 3% -0.1 1.71 ą 3% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap > 1.78 ą 2% -0.1 1.65 ą 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap > 2.69 -0.1 2.56 
perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap > 1.78 ą 2% -0.1 1.66 ą 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core > 1.36 ą 2% -0.1 1.23 ą 2% perf-profile.calltrace.cycles-pp.rcu_cblist_dequeue.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd > 0.95 -0.1 0.83 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap > 3.29 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 2.08 -0.1 1.96 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap > 1.43 ą 3% -0.1 1.32 ą 3% perf-profile.calltrace.cycles-pp.down_write.vma_prepare.vma_merge.copy_vma.move_vma > 2.21 -0.1 2.10 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 2.47 -0.1 2.36 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma > 2.21 -0.1 2.12 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables > 1.41 -0.1 1.32 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 1.26 -0.1 1.18 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap > 1.82 -0.1 1.75 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 0.71 -0.1 0.63 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap > 1.29 -0.1 1.22 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma > 0.61 -0.1 0.54 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma > 1.36 -0.1 1.29 perf-profile.calltrace.cycles-pp.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap > 1.40 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma > 0.70 -0.1 0.64 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > 1.23 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma > 1.66 -0.1 1.60 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 1.16 -0.1 1.10 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 0.96 -0.1 0.90 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region > 1.14 -0.1 1.08 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > 0.79 -0.1 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma > 1.04 -0.1 1.00 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.58 -0.0 0.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap > 0.61 -0.0 0.56 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.vm_area_free_rcu_cb.rcu_do_batch.rcu_core > 0.56 -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > 0.57 -0.0 0.53 ą 2% 
perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs > 0.78 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge > 0.88 -0.0 0.84 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 > 0.70 -0.0 0.66 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.68 -0.0 0.64 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap > 0.97 -0.0 0.93 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 1.11 -0.0 1.08 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap > 0.75 -0.0 0.72 perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma > 0.74 -0.0 0.71 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap > 0.60 ą 2% -0.0 0.57 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 > 0.67 ą 2% -0.0 0.64 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma > 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.63 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.99 -0.0 0.96 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap > 0.62 ą 2% -0.0 0.59 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > 0.87 -0.0 0.84 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.78 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap > 0.64 -0.0 0.62 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap > 0.90 -0.0 0.87 perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap > 0.54 -0.0 0.52 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.move_ptes.move_page_tables.move_vma.__do_sys_mremap > 1.04 +0.0 1.08 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region > 0.76 +0.1 0.83 perf-profile.calltrace.cycles-pp.__madvise > 0.63 +0.1 0.70 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 0.62 +0.1 0.70 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise > 0.66 +0.1 0.74 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 87.74 +0.7 88.45 perf-profile.calltrace.cycles-pp.mremap > 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap > 0.00 +0.9 0.86 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap > 84.88 +0.9 85.77 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap > 84.73 +0.9 85.62 
perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.00 +0.9 0.92 ą 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma > 83.84 +0.9 84.78 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.00 +1.1 1.06 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 > 0.00 +1.2 1.21 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to > 2.07 +1.5 3.55 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 1.58 +1.5 3.07 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 > 0.00 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap > 0.00 +1.6 1.57 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.00 +1.7 1.72 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > 0.00 +2.0 2.01 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 5.39 +2.9 8.32 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 75.29 -1.9 73.37 perf-profile.children.cycles-pp.move_vma > 37.06 -1.6 35.50 perf-profile.children.cycles-pp.do_vmi_align_munmap > 24.98 -1.2 23.80 perf-profile.children.cycles-pp.copy_vma > 19.99 -1.0 19.02 perf-profile.children.cycles-pp.handle_softirqs > 19.97 -1.0 19.00 perf-profile.children.cycles-pp.rcu_core > 19.95 -1.0 18.98 perf-profile.children.cycles-pp.rcu_do_batch > 19.98 -0.9 19.06 perf-profile.children.cycles-pp.__split_vma > 17.55 -0.8 16.76 perf-profile.children.cycles-pp.kmem_cache_free > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.children.cycles-pp.run_ksoftirqd > 10.57 ą 2% -0.8 9.80 ą 2% perf-profile.children.cycles-pp.smpboot_thread_fn > 15.38 -0.8 14.62 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof > 10.62 ą 2% -0.8 9.85 ą 2% perf-profile.children.cycles-pp.kthread > 10.62 ą 2% -0.8 9.86 ą 2% perf-profile.children.cycles-pp.ret_from_fork > 10.62 ą 2% -0.8 9.86 ą 2% perf-profile.children.cycles-pp.ret_from_fork_asm > 15.14 -0.7 14.44 perf-profile.children.cycles-pp.vma_merge > 12.08 -0.5 11.55 perf-profile.children.cycles-pp.__slab_free > 12.11 -0.5 11.62 perf-profile.children.cycles-pp.mas_wr_store_entry > 10.86 -0.5 10.39 perf-profile.children.cycles-pp.vm_area_dup > 11.89 -0.5 11.44 perf-profile.children.cycles-pp.mas_store_prealloc > 8.49 -0.4 8.06 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook > 9.88 -0.4 9.49 perf-profile.children.cycles-pp.mas_wr_node_store > 7.91 -0.3 7.58 perf-profile.children.cycles-pp.move_page_tables > 6.06 -0.3 5.78 perf-profile.children.cycles-pp.vm_area_free_rcu_cb > 8.28 -0.3 8.00 perf-profile.children.cycles-pp.unmap_region > 6.69 -0.3 6.42 perf-profile.children.cycles-pp.vma_complete > 5.06 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate > 5.82 -0.2 5.57 perf-profile.children.cycles-pp.move_ptes > 4.24 -0.2 4.01 perf-profile.children.cycles-pp.anon_vma_clone > 3.50 -0.2 3.30 perf-profile.children.cycles-pp.down_write > 2.44 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev > 3.46 -0.2 3.28 perf-profile.children.cycles-pp.___slab_alloc > 3.45 -0.2 3.27 perf-profile.children.cycles-pp.free_pgtables > 2.54 -0.2 2.37 
perf-profile.children.cycles-pp.rcu_cblist_dequeue > 3.35 -0.2 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook > 2.93 -0.2 2.78 perf-profile.children.cycles-pp.mas_alloc_nodes > 2.28 ą 2% -0.2 2.12 ą 2% perf-profile.children.cycles-pp.vma_prepare > 3.46 -0.1 3.32 perf-profile.children.cycles-pp.flush_tlb_mm_range > 3.41 -0.1 3.27 ą 2% perf-profile.children.cycles-pp.mod_objcg_state > 2.76 -0.1 2.63 perf-profile.children.cycles-pp.unlink_anon_vmas > 3.41 -0.1 3.28 perf-profile.children.cycles-pp.mas_store_gfp > 2.21 -0.1 2.09 perf-profile.children.cycles-pp.__cond_resched > 2.04 -0.1 1.94 perf-profile.children.cycles-pp.allocate_slab > 2.10 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common > 2.51 -0.1 2.40 perf-profile.children.cycles-pp.flush_tlb_func > 1.04 -0.1 0.94 perf-profile.children.cycles-pp.mas_prev > 2.71 -0.1 2.61 perf-profile.children.cycles-pp.mtree_load > 2.23 -0.1 2.14 perf-profile.children.cycles-pp.native_flush_tlb_one_user > 0.22 ą 5% -0.1 0.13 ą 13% perf-profile.children.cycles-pp.vm_stat_account > 0.95 -0.1 0.87 perf-profile.children.cycles-pp.mas_prev_setup > 1.65 -0.1 1.57 perf-profile.children.cycles-pp.mas_wr_walk > 1.84 -0.1 1.76 perf-profile.children.cycles-pp.up_write > 1.27 -0.1 1.20 perf-profile.children.cycles-pp.mas_prev_slot > 1.84 -0.1 1.77 perf-profile.children.cycles-pp.vma_link > 1.39 -0.1 1.32 perf-profile.children.cycles-pp.shuffle_freelist > 0.96 -0.1 0.90 ą 2% perf-profile.children.cycles-pp.rcu_all_qs > 0.86 -0.1 0.80 perf-profile.children.cycles-pp._raw_spin_lock_irqsave > 1.70 -0.1 1.64 perf-profile.children.cycles-pp.__get_unmapped_area > 0.34 ą 3% -0.1 0.29 ą 5% perf-profile.children.cycles-pp.security_vm_enough_memory_mm > 0.60 -0.0 0.55 perf-profile.children.cycles-pp.entry_SYSCALL_64 > 0.92 -0.0 0.87 perf-profile.children.cycles-pp.percpu_counter_add_batch > 1.07 -0.0 1.02 perf-profile.children.cycles-pp.vma_to_resize > 1.59 -0.0 1.54 perf-profile.children.cycles-pp.mas_update_gap > 0.44 ą 2% -0.0 0.40 ą 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath > 0.70 -0.0 0.66 perf-profile.children.cycles-pp.syscall_return_via_sysret > 1.13 -0.0 1.09 perf-profile.children.cycles-pp.mt_find > 0.20 ą 6% -0.0 0.17 ą 9% perf-profile.children.cycles-pp.cap_vm_enough_memory > 0.99 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node > 0.63 ą 2% -0.0 0.59 perf-profile.children.cycles-pp.security_mmap_addr > 0.62 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials > 1.17 -0.0 1.14 perf-profile.children.cycles-pp.clear_bhb_loop > 0.46 -0.0 0.43 ą 2% perf-profile.children.cycles-pp.__alloc_pages_noprof > 0.44 -0.0 0.41 ą 2% perf-profile.children.cycles-pp.get_page_from_freelist > 0.90 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete > 0.64 ą 2% -0.0 0.62 perf-profile.children.cycles-pp.get_old_pud > 1.07 -0.0 1.05 perf-profile.children.cycles-pp.mas_leaf_max_gap > 0.22 ą 3% -0.0 0.20 ą 2% perf-profile.children.cycles-pp.__rmqueue_pcplist > 0.55 -0.0 0.53 perf-profile.children.cycles-pp.refill_obj_stock > 0.25 -0.0 0.23 ą 3% perf-profile.children.cycles-pp.rmqueue > 0.48 -0.0 0.45 perf-profile.children.cycles-pp.mremap_userfaultfd_prep > 0.33 -0.0 0.30 perf-profile.children.cycles-pp.free_unref_page > 0.46 -0.0 0.44 perf-profile.children.cycles-pp.setup_object > 0.21 ą 3% -0.0 0.19 ą 2% perf-profile.children.cycles-pp.rmqueue_bulk > 0.31 ą 3% -0.0 0.29 perf-profile.children.cycles-pp.__vm_enough_memory > 0.40 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack > 0.36 -0.0 0.35 
perf-profile.children.cycles-pp.madvise_vma_behavior > 0.54 -0.0 0.53 ą 2% perf-profile.children.cycles-pp.mas_wr_end_piv > 0.46 -0.0 0.44 ą 2% perf-profile.children.cycles-pp.rcu_segcblist_enqueue > 0.34 -0.0 0.32 ą 2% perf-profile.children.cycles-pp.mas_destroy > 0.28 -0.0 0.26 ą 3% perf-profile.children.cycles-pp.mas_wr_store_setup > 0.30 -0.0 0.28 perf-profile.children.cycles-pp.pte_offset_map_nolock > 0.19 -0.0 0.18 ą 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders > 0.08 ą 4% -0.0 0.07 perf-profile.children.cycles-pp.ksm_madvise > 0.17 -0.0 0.16 perf-profile.children.cycles-pp.get_any_partial > 0.08 -0.0 0.07 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare > 0.45 +0.0 0.47 perf-profile.children.cycles-pp._raw_spin_lock > 1.10 +0.0 1.14 perf-profile.children.cycles-pp.zap_pte_range > 0.78 +0.1 0.85 perf-profile.children.cycles-pp.__madvise > 0.63 +0.1 0.70 perf-profile.children.cycles-pp.__x64_sys_madvise > 0.62 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise > 0.00 +0.1 0.09 ą 4% perf-profile.children.cycles-pp.can_modify_mm_madv > 1.32 +0.1 1.46 perf-profile.children.cycles-pp.mas_next_slot > 88.13 +0.7 88.83 perf-profile.children.cycles-pp.mremap > 83.94 +0.9 84.88 perf-profile.children.cycles-pp.__do_sys_mremap > 86.06 +0.9 87.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > 85.56 +1.0 86.54 perf-profile.children.cycles-pp.do_syscall_64 > 40.49 +1.4 41.90 perf-profile.children.cycles-pp.do_vmi_munmap > 2.10 +1.5 3.57 perf-profile.children.cycles-pp.do_munmap > 3.62 +2.3 5.90 perf-profile.children.cycles-pp.mas_walk > 5.44 +2.9 8.38 perf-profile.children.cycles-pp.mremap_to > 5.30 +3.1 8.39 perf-profile.children.cycles-pp.mas_find > 0.00 +5.4 5.40 perf-profile.children.cycles-pp.can_modify_mm > 11.46 -0.5 10.96 perf-profile.self.cycles-pp.__slab_free > 4.30 -0.2 4.08 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook > 2.51 -0.2 2.34 perf-profile.self.cycles-pp.rcu_cblist_dequeue > 2.41 ą 2% -0.2 2.25 perf-profile.self.cycles-pp.down_write > 2.21 -0.1 2.11 perf-profile.self.cycles-pp.native_flush_tlb_one_user > 2.37 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load > 1.60 -0.1 1.51 perf-profile.self.cycles-pp.__memcg_slab_free_hook > 0.18 ą 3% -0.1 0.10 ą 15% perf-profile.self.cycles-pp.vm_stat_account > 1.25 -0.1 1.18 perf-profile.self.cycles-pp.move_vma > 1.76 -0.1 1.69 perf-profile.self.cycles-pp.mod_objcg_state > 1.42 -0.1 1.35 ą 2% perf-profile.self.cycles-pp.__call_rcu_common > 1.41 -0.1 1.34 perf-profile.self.cycles-pp.mas_wr_walk > 1.52 -0.1 1.46 perf-profile.self.cycles-pp.up_write > 1.02 -0.1 0.95 perf-profile.self.cycles-pp.mas_prev_slot > 0.96 -0.1 0.90 ą 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb > 1.50 -0.1 1.45 perf-profile.self.cycles-pp.kmem_cache_free > 0.69 ą 3% -0.1 0.64 ą 2% perf-profile.self.cycles-pp.rcu_all_qs > 1.14 ą 2% -0.1 1.09 perf-profile.self.cycles-pp.shuffle_freelist > 1.10 -0.1 1.05 perf-profile.self.cycles-pp.__cond_resched > 1.40 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap > 0.99 -0.0 0.94 perf-profile.self.cycles-pp.mas_preallocate > 0.88 -0.0 0.83 perf-profile.self.cycles-pp.___slab_alloc > 0.55 -0.0 0.50 perf-profile.self.cycles-pp.mremap_to > 0.98 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes > 0.78 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch > 0.21 ą 2% -0.0 0.18 ą 2% perf-profile.self.cycles-pp.entry_SYSCALL_64 > 0.44 ą 2% -0.0 0.40 ą 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath > 0.92 -0.0 0.89 
perf-profile.self.cycles-pp.mas_store_gfp > 0.86 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node > 0.50 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > 1.15 -0.0 1.12 perf-profile.self.cycles-pp.clear_bhb_loop > 1.14 -0.0 1.11 perf-profile.self.cycles-pp.vma_merge > 0.66 -0.0 0.63 perf-profile.self.cycles-pp.__split_vma > 0.16 ą 6% -0.0 0.13 ą 7% perf-profile.self.cycles-pp.cap_vm_enough_memory > 0.82 -0.0 0.79 perf-profile.self.cycles-pp.mas_wr_store_entry > 0.54 ą 2% -0.0 0.52 perf-profile.self.cycles-pp.get_old_pud > 0.43 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap > 0.51 ą 2% -0.0 0.48 ą 2% perf-profile.self.cycles-pp.security_mmap_addr > 0.50 -0.0 0.48 perf-profile.self.cycles-pp.refill_obj_stock > 0.24 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev > 0.71 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range > 0.48 -0.0 0.45 perf-profile.self.cycles-pp.find_vma_prev > 0.42 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave > 0.66 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc > 0.31 -0.0 0.29 perf-profile.self.cycles-pp.mas_prev_setup > 0.43 -0.0 0.41 perf-profile.self.cycles-pp.mas_wr_end_piv > 0.78 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete > 0.28 -0.0 0.26 ą 2% perf-profile.self.cycles-pp.mas_put_in_tree > 0.42 -0.0 0.40 perf-profile.self.cycles-pp.mremap_userfaultfd_prep > 0.28 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables > 0.39 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > 0.30 ą 2% -0.0 0.28 perf-profile.self.cycles-pp.zap_pmd_range > 0.32 -0.0 0.31 perf-profile.self.cycles-pp.unmap_vmas > 0.21 -0.0 0.20 perf-profile.self.cycles-pp.__get_unmapped_area > 0.18 ą 2% -0.0 0.17 ą 2% perf-profile.self.cycles-pp.lru_add_drain_cpu > 0.06 -0.0 0.05 perf-profile.self.cycles-pp.ksm_madvise > 0.45 +0.0 0.46 perf-profile.self.cycles-pp.do_vmi_munmap > 0.37 +0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk > > > *************************************************************************************************** > lkp-spr-r02: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory > ========================================================================================= > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pkey/stress-ng/60s > > commit: > ff388fe5c4 ("mseal: wire up mseal syscall") > 8be7258aad ("mseal: add mseal syscall") > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13 > ---------------- --------------------------- > %stddev %change %stddev > \ | \ > 10539 -2.5% 10273 vmstat.system.cs > 0.28 ą 5% -20.1% 0.22 ą 7% sched_debug.cfs_rq:/.h_nr_running.stddev > 1419 ą 7% -15.3% 1202 ą 6% sched_debug.cfs_rq:/.util_avg.max > 0.28 ą 6% -18.4% 0.23 ą 8% sched_debug.cpu.nr_running.stddev > 8.736e+08 -3.6% 8.423e+08 stress-ng.pkey.ops > 14560560 -3.6% 14038795 stress-ng.pkey.ops_per_sec > 770.39 ą 4% -5.0% 732.04 stress-ng.time.user_time > 244657 ą 3% +5.8% 258782 ą 3% proc-vmstat.nr_slab_unreclaimable > 73133541 -2.1% 71588873 proc-vmstat.numa_hit > 72873579 -2.1% 71357274 proc-vmstat.numa_local > 1.842e+08 -2.5% 1.796e+08 proc-vmstat.pgalloc_normal > 1.767e+08 -2.8% 1.717e+08 proc-vmstat.pgfree > 1345346 ą 
40% -73.1% 362064 ą124% numa-vmstat.node0.nr_inactive_anon > 1345340 ą 40% -73.1% 362062 ą124% numa-vmstat.node0.nr_zone_inactive_anon > 2420830 ą 14% +35.1% 3270248 ą 16% numa-vmstat.node1.nr_file_pages > 2067871 ą 13% +51.5% 3132982 ą 17% numa-vmstat.node1.nr_inactive_anon > 191406 ą 17% +33.6% 255808 ą 14% numa-vmstat.node1.nr_mapped > 2452 ą 61% +104.4% 5012 ą 35% numa-vmstat.node1.nr_page_table_pages > 2067853 ą 13% +51.5% 3132966 ą 17% numa-vmstat.node1.nr_zone_inactive_anon > 5379238 ą 40% -73.0% 1453605 ą123% numa-meminfo.node0.Inactive > 5379166 ą 40% -73.0% 1453462 ą123% numa-meminfo.node0.Inactive(anon) > 8741077 ą 22% -36.7% 5531290 ą 28% numa-meminfo.node0.MemUsed > 9651902 ą 13% +35.8% 13105318 ą 16% numa-meminfo.node1.FilePages > 8239855 ą 13% +52.4% 12556929 ą 17% numa-meminfo.node1.Inactive > 8239712 ą 13% +52.4% 12556853 ą 17% numa-meminfo.node1.Inactive(anon) > 761944 ą 18% +34.6% 1025906 ą 14% numa-meminfo.node1.Mapped > 11679628 ą 11% +31.2% 15322841 ą 14% numa-meminfo.node1.MemUsed > 9874 ą 62% +104.6% 20200 ą 36% numa-meminfo.node1.PageTables > 0.74 -4.2% 0.71 perf-stat.i.MPKI > 1.245e+11 +2.3% 1.274e+11 perf-stat.i.branch-instructions > 0.37 -0.0 0.35 perf-stat.i.branch-miss-rate% > 4.359e+08 -2.1% 4.265e+08 perf-stat.i.branch-misses > 4.672e+08 -2.6% 4.548e+08 perf-stat.i.cache-misses > 7.276e+08 -2.7% 7.082e+08 perf-stat.i.cache-references > 1.00 -1.6% 0.98 perf-stat.i.cpi > 1364 +2.9% 1404 perf-stat.i.cycles-between-cache-misses > 6.392e+11 +1.7% 6.499e+11 perf-stat.i.instructions > 1.00 +1.6% 1.02 perf-stat.i.ipc > 0.74 -4.3% 0.71 perf-stat.overall.MPKI > 0.35 -0.0 0.33 perf-stat.overall.branch-miss-rate% > 1.00 -1.6% 0.99 perf-stat.overall.cpi > 1356 +2.9% 1395 perf-stat.overall.cycles-between-cache-misses > 1.00 +1.6% 1.01 perf-stat.overall.ipc > 1.209e+11 +1.9% 1.232e+11 perf-stat.ps.branch-instructions > 4.188e+08 -2.6% 4.077e+08 perf-stat.ps.branch-misses > 4.585e+08 -3.1% 4.441e+08 perf-stat.ps.cache-misses > 7.124e+08 -3.1% 6.901e+08 perf-stat.ps.cache-references > 10321 -2.6% 10053 perf-stat.ps.context-switches > > > > > > Disclaimer: > Results have been estimated based on internal Intel analysis and are provided > for informational purposes only. Any difference in system hardware or software > design or configuration may affect actual performance. > > > -- > 0-DAY CI Kernel Test Service > https://github.com/intel/lkp-tests/wiki > ^ permalink raw reply [flat|nested] 29+ messages in thread
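
[Editor's note on the profile deltas above: the new can_modify_mm samples, and the matching growth in mas_find/mas_walk under mremap_to, do_munmap and move_vma, come from an up-front scan of the affected VMA range that the mseal commit adds before the munmap/mremap paths modify anything. The following is only a rough sketch of that kind of pre-check; the helper name, call sites and locking are simplified assumptions rather than the kernel's verbatim code. It is meant to show why one extra maple-tree walk appears per operation even when nothing is sealed.]

/*
 * Illustrative sketch only -- simplified from the idea behind the mseal
 * pre-check, not copied from the kernel.  The extra walk over
 * [start, end) before every munmap/mremap-style modification is where
 * the added mas_find()/mas_walk() cycles in the profiles show up,
 * even when no VMA is sealed.
 */
static bool can_modify_mm_sketch(struct mm_struct *mm,
				 unsigned long start, unsigned long end)
{
	struct vm_area_struct *vma;
	VMA_ITERATOR(vmi, mm, start);

	/* One full pass over the range, on every call. */
	for_each_vma_range(vmi, vma, end) {
		if (vma->vm_flags & VM_SEALED)
			return false;	/* refuse to touch sealed memory */
	}

	return true;	/* nothing sealed; the real operation may proceed */
}
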
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-05 16:58 ` Jeff Xu @ 2024-08-06 1:44 ` Oliver Sang 2024-08-06 14:54 ` Jeff Xu 0 siblings, 1 reply; 29+ messages in thread From: Oliver Sang @ 2024-08-06 1:44 UTC (permalink / raw) To: Jeff Xu Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jeff Xu, Jonathan Corbet, Jorge Lucangeli Obes, Linus Torvalds, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin, oliver.sang hi, Jeff, On Mon, Aug 05, 2024 at 09:58:33AM -0700, Jeff Xu wrote: > On Sun, Aug 4, 2024 at 1:59 AM kernel test robot <oliver.sang@intel.com> wrote: > > > > > > > > Hello, > > > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on: > > > > > > commit: 8be7258aad44b5e25977a98db136f677fa6f4370 ("mseal: add mseal syscall") > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > > > testcase: stress-ng > > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory > > parameters: > > > > nr_threads: 100% > > testtime: 60s > > test: pagemove > > cpufreq_governor: performance > > > > > > In addition to that, the commit also has significant impact on the following tests: > > > > +------------------+---------------------------------------------------------------------------------------------+ > > | testcase: change | stress-ng: stress-ng.pkey.ops_per_sec -3.6% regression | > > | test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory | > > | test parameters | cpufreq_governor=performance | > > | | nr_threads=100% | > > | | test=pkey | > > | | testtime=60s | > > +------------------+---------------------------------------------------------------------------------------------+ > > > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of > > the same patch/commit), kindly add following tags > > | Reported-by: kernel test robot <oliver.sang@intel.com> > > | Closes: https://lore.kernel.org/oe-lkp/202408041602.caa0372-oliver.sang@intel.com > > > > > > Details are as below: > > --------------------------------------------------------------------------------------------------> > > > > > > The kernel config and materials to reproduce are available at: > > https://download.01.org/0day-ci/archive/20240804/202408041602.caa0372-oliver.sang@intel.com > > > There is an error when I try to reproduce the test: what's your os? we support some distributions https://github.com/intel/lkp-tests?tab=readme-ov-file#supported-distributions > > bin/lkp install job.yaml > > -------------------------------------------------------- > Some packages could not be installed. This may mean that you have > requested an impossible situation or if you are using the unstable > distribution that some required packages have not yet been created > or been moved out of Incoming. > The following information may help to resolve the situation: > > The following packages have unmet dependencies: > libdw1 : Depends: libelf1 (= 0.190-1+b1) > libdw1t64 : Breaks: libdw1 (< 0.191-2) > E: Unable to correct problems, you have held broken packages. 
> Cannot install some packages of perf-c2c depends > ----------------------------------------------------------------------------------------- > > And where is stress-ng.pagemove.page_remaps_per_sec test implemented, > is that part of lkp-tests ? stress-ng is in https://github.com/ColinIanKing/stress-ng > > Thanks > -Jeff > > > ========================================================================================= > > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: > > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s > > > > commit: > > ff388fe5c4 ("mseal: wire up mseal syscall") > > 8be7258aad ("mseal: add mseal syscall") > > > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13 > > ---------------- --------------------------- > > %stddev %change %stddev > > \ | \ > > 41625945 -4.3% 39842322 proc-vmstat.numa_hit > > 41559175 -4.3% 39774160 proc-vmstat.numa_local > > 77484314 -4.4% 74105555 proc-vmstat.pgalloc_normal > > 77205752 -4.4% 73826672 proc-vmstat.pgfree > > 18361466 -4.2% 17596652 stress-ng.pagemove.ops > > 306014 -4.2% 293262 stress-ng.pagemove.ops_per_sec > > 205312 -4.4% 196176 stress-ng.pagemove.page_remaps_per_sec > > 4961 +1.0% 5013 stress-ng.time.percent_of_cpu_this_job_got > > 2917 +1.2% 2952 stress-ng.time.system_time > > 1.07 -6.6% 1.00 perf-stat.i.MPKI > > 3.354e+10 +3.5% 3.473e+10 perf-stat.i.branch-instructions > > 1.795e+08 -4.2% 1.719e+08 perf-stat.i.cache-misses > > 2.376e+08 -4.1% 2.279e+08 perf-stat.i.cache-references > > 1.13 -3.0% 1.10 perf-stat.i.cpi > > 1077 +4.3% 1124 perf-stat.i.cycles-between-cache-misses > > 1.717e+11 +2.7% 1.762e+11 perf-stat.i.instructions > > 0.88 +3.1% 0.91 perf-stat.i.ipc > > 1.05 -6.8% 0.97 perf-stat.overall.MPKI > > 0.25 ą 2% -0.0 0.24 perf-stat.overall.branch-miss-rate% > > 1.13 -3.0% 1.10 perf-stat.overall.cpi > > 1084 +4.0% 1127 perf-stat.overall.cycles-between-cache-misses > > 0.88 +3.1% 0.91 perf-stat.overall.ipc > > 3.298e+10 +3.5% 3.415e+10 perf-stat.ps.branch-instructions > > 1.764e+08 -4.3% 1.689e+08 perf-stat.ps.cache-misses > > 2.336e+08 -4.1% 2.24e+08 perf-stat.ps.cache-references > > 194.57 -2.4% 189.96 ą 2% perf-stat.ps.cpu-migrations > > 1.688e+11 +2.7% 1.733e+11 perf-stat.ps.instructions > > 1.036e+13 +3.0% 1.068e+13 perf-stat.total.instructions > > 75.12 -1.9 73.22 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > 36.84 -1.6 35.29 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > > 24.90 -1.2 23.72 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 19.89 -0.9 18.98 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 10.56 ą 2% -0.8 9.78 ą 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread > > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork > > 10.56 ą 2% -0.8 9.79 ą 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > > 10.57 ą 2% -0.8 9.80 ą 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > > 10.52 ą 2% -0.8 9.75 ą 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn > > 10.62 ą 2% -0.8 9.85 ą 2% 
> > [snip: remainder of the quoted benchmark report trimmed -- the full tables appear in the original report above]
> >
^ permalink raw reply	[flat|nested] 29+ messages in thread
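
[Editor's note: for a quicker sanity check outside lkp-tests, the pagemove stressor Oliver points to above spends its time remapping pages to fixed destination addresses, so each iteration runs through mremap_to()/do_munmap()/move_vma(), the call sites where can_modify_mm shows up in the profiles. Below is a hedged userspace sketch of that pattern -- not stress-ng itself; the buffer size, iteration count and reported metric are arbitrary illustration choices.]

/* Hedged reproducer sketch (not stress-ng): bounce a mapping between two
 * fixed windows so every iteration takes the mremap_to() kernel path.
 * Build (hypothetical filename): gcc -O2 remap_sketch.c -o remap_sketch */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define PAGES 64
#define ITERS 200000

int main(void)
{
	long pg = sysconf(_SC_PAGESIZE);
	size_t len = (size_t)pg * PAGES;

	/* Map two adjacent windows, then free the upper one so it can be
	 * used as a fixed mremap destination. */
	char *src = mmap(NULL, len * 2, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (src == MAP_FAILED) { perror("mmap"); return 1; }
	char *dst = src + len;
	munmap(dst, len);

	struct timespec t0, t1;
	clock_gettime(CLOCK_MONOTONIC, &t0);

	for (int i = 0; i < ITERS; i++) {
		/* Move the mapping to the other window; each call exercises
		 * mremap_to() -> do_munmap() + move_vma() in the kernel. */
		char *p = mremap(src, len, len,
				 MREMAP_MAYMOVE | MREMAP_FIXED, dst);
		if (p == MAP_FAILED) { perror("mremap"); return 1; }
		char *tmp = src; src = dst; dst = tmp;
	}

	clock_gettime(CLOCK_MONOTONIC, &t1);
	double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
	printf("%.0f remaps/sec\n", ITERS / secs);
	return 0;
}
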
* Re: [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression 2024-08-06 1:44 ` Oliver Sang @ 2024-08-06 14:54 ` Jeff Xu 0 siblings, 0 replies; 29+ messages in thread From: Jeff Xu @ 2024-08-06 14:54 UTC (permalink / raw) To: Oliver Sang Cc: Jeff Xu, oe-lkp, lkp, linux-kernel, Andrew Morton, Kees Cook, Liam R. Howlett, Pedro Falcato, Dave Hansen, Greg Kroah-Hartman, Guenter Roeck, Jann Horn, Jonathan Corbet, Jorge Lucangeli Obes, Linus Torvalds, Matthew Wilcox, Muhammad Usama Anjum, Stephen Röttger, Suren Baghdasaryan, Amer Al Shanawany, Javier Carrasco, Shuah Khan, linux-api, linux-mm, ying.huang, feng.tang, fengwei.yin On Mon, Aug 5, 2024 at 6:44 PM Oliver Sang <oliver.sang@intel.com> wrote: > > hi, Jeff, > > On Mon, Aug 05, 2024 at 09:58:33AM -0700, Jeff Xu wrote: > > On Sun, Aug 4, 2024 at 1:59 AM kernel test robot <oliver.sang@intel.com> wrote: > > > > > > > > > > > > Hello, > > > > > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on: > > > > > > > > > commit: 8be7258aad44b5e25977a98db136f677fa6f4370 ("mseal: add mseal syscall") > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > > > > > testcase: stress-ng > > > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory > > > parameters: > > > > > > nr_threads: 100% > > > testtime: 60s > > > test: pagemove > > > cpufreq_governor: performance > > > > > > > > > In addition to that, the commit also has significant impact on the following tests: > > > > > > +------------------+---------------------------------------------------------------------------------------------+ > > > | testcase: change | stress-ng: stress-ng.pkey.ops_per_sec -3.6% regression | > > > | test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory | > > > | test parameters | cpufreq_governor=performance | > > > | | nr_threads=100% | > > > | | test=pkey | > > > | | testtime=60s | > > > +------------------+---------------------------------------------------------------------------------------------+ > > > > > > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of > > > the same patch/commit), kindly add following tags > > > | Reported-by: kernel test robot <oliver.sang@intel.com> > > > | Closes: https://lore.kernel.org/oe-lkp/202408041602.caa0372-oliver.sang@intel.com > > > > > > > > > Details are as below: > > > --------------------------------------------------------------------------------------------------> > > > > > > > > > The kernel config and materials to reproduce are available at: > > > https://download.01.org/0day-ci/archive/20240804/202408041602.caa0372-oliver.sang@intel.com > > > > > There is an error when I try to reproduce the test: > > what's your os? we support some distributions > https://github.com/intel/lkp-tests?tab=readme-ov-file#supported-distributions > > > > > bin/lkp install job.yaml > > > > -------------------------------------------------------- > > Some packages could not be installed. This may mean that you have > > requested an impossible situation or if you are using the unstable > > distribution that some required packages have not yet been created > > or been moved out of Incoming. 
> > The following information may help to resolve the situation: > > > > The following packages have unmet dependencies: > > libdw1 : Depends: libelf1 (= 0.190-1+b1) > > libdw1t64 : Breaks: libdw1 (< 0.191-2) > > E: Unable to correct problems, you have held broken packages. > > Cannot install some packages of perf-c2c depends > > ----------------------------------------------------------------------------------------- > > > > And where is stress-ng.pagemove.page_remaps_per_sec test implemented, > > is that part of lkp-tests ? > > stress-ng is in https://github.com/ColinIanKing/stress-ng > I will try this route first. Thanks -Jeff > > > > Thanks > > -Jeff > > > > > ========================================================================================= > > > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: > > > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s > > > > > > commit: > > > ff388fe5c4 ("mseal: wire up mseal syscall") > > > 8be7258aad ("mseal: add mseal syscall") > > > > > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13 > > > ---------------- --------------------------- > > > %stddev %change %stddev > > > \ | \ > > > 41625945 -4.3% 39842322 proc-vmstat.numa_hit > > > 41559175 -4.3% 39774160 proc-vmstat.numa_local > > > 77484314 -4.4% 74105555 proc-vmstat.pgalloc_normal > > > 77205752 -4.4% 73826672 proc-vmstat.pgfree > > > 18361466 -4.2% 17596652 stress-ng.pagemove.ops > > > 306014 -4.2% 293262 stress-ng.pagemove.ops_per_sec > > > 205312 -4.4% 196176 stress-ng.pagemove.page_remaps_per_sec > > > 4961 +1.0% 5013 stress-ng.time.percent_of_cpu_this_job_got > > > 2917 +1.2% 2952 stress-ng.time.system_time > > > 1.07 -6.6% 1.00 perf-stat.i.MPKI > > > 3.354e+10 +3.5% 3.473e+10 perf-stat.i.branch-instructions > > > 1.795e+08 -4.2% 1.719e+08 perf-stat.i.cache-misses > > > 2.376e+08 -4.1% 2.279e+08 perf-stat.i.cache-references > > > 1.13 -3.0% 1.10 perf-stat.i.cpi > > > 1077 +4.3% 1124 perf-stat.i.cycles-between-cache-misses > > > 1.717e+11 +2.7% 1.762e+11 perf-stat.i.instructions > > > 0.88 +3.1% 0.91 perf-stat.i.ipc > > > 1.05 -6.8% 0.97 perf-stat.overall.MPKI > > > 0.25 ą 2% -0.0 0.24 perf-stat.overall.branch-miss-rate% > > > 1.13 -3.0% 1.10 perf-stat.overall.cpi > > > 1084 +4.0% 1127 perf-stat.overall.cycles-between-cache-misses > > > 0.88 +3.1% 0.91 perf-stat.overall.ipc > > > 3.298e+10 +3.5% 3.415e+10 perf-stat.ps.branch-instructions > > > 1.764e+08 -4.3% 1.689e+08 perf-stat.ps.cache-misses > > > 2.336e+08 -4.1% 2.24e+08 perf-stat.ps.cache-references > > > 194.57 -2.4% 189.96 ą 2% perf-stat.ps.cpu-migrations > > > 1.688e+11 +2.7% 1.733e+11 perf-stat.ps.instructions > > > 1.036e+13 +3.0% 1.068e+13 perf-stat.total.instructions > > > 75.12 -1.9 73.22 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > > 36.84 -1.6 35.29 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > > > 24.90 -1.2 23.72 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > > 19.89 -0.9 18.98 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > > 10.56 ą 2% -0.8 9.78 ą 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread > > > 10.56 ą 2% -0.8 9.79 ą 2% 
> > > [snip: remainder of the quoted benchmark report trimmed]
ą 2% perf-profile.children.cycles-pp.ret_from_fork_asm > > > 15.14 -0.7 14.44 perf-profile.children.cycles-pp.vma_merge > > > 12.08 -0.5 11.55 perf-profile.children.cycles-pp.__slab_free > > > 12.11 -0.5 11.62 perf-profile.children.cycles-pp.mas_wr_store_entry > > > 10.86 -0.5 10.39 perf-profile.children.cycles-pp.vm_area_dup > > > 11.89 -0.5 11.44 perf-profile.children.cycles-pp.mas_store_prealloc > > > 8.49 -0.4 8.06 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook > > > 9.88 -0.4 9.49 perf-profile.children.cycles-pp.mas_wr_node_store > > > 7.91 -0.3 7.58 perf-profile.children.cycles-pp.move_page_tables > > > 6.06 -0.3 5.78 perf-profile.children.cycles-pp.vm_area_free_rcu_cb > > > 8.28 -0.3 8.00 perf-profile.children.cycles-pp.unmap_region > > > 6.69 -0.3 6.42 perf-profile.children.cycles-pp.vma_complete > > > 5.06 -0.3 4.80 perf-profile.children.cycles-pp.mas_preallocate > > > 5.82 -0.2 5.57 perf-profile.children.cycles-pp.move_ptes > > > 4.24 -0.2 4.01 perf-profile.children.cycles-pp.anon_vma_clone > > > 3.50 -0.2 3.30 perf-profile.children.cycles-pp.down_write > > > 2.44 -0.2 2.25 perf-profile.children.cycles-pp.find_vma_prev > > > 3.46 -0.2 3.28 perf-profile.children.cycles-pp.___slab_alloc > > > 3.45 -0.2 3.27 perf-profile.children.cycles-pp.free_pgtables > > > 2.54 -0.2 2.37 perf-profile.children.cycles-pp.rcu_cblist_dequeue > > > 3.35 -0.2 3.18 perf-profile.children.cycles-pp.__memcg_slab_free_hook > > > 2.93 -0.2 2.78 perf-profile.children.cycles-pp.mas_alloc_nodes > > > 2.28 ą 2% -0.2 2.12 ą 2% perf-profile.children.cycles-pp.vma_prepare > > > 3.46 -0.1 3.32 perf-profile.children.cycles-pp.flush_tlb_mm_range > > > 3.41 -0.1 3.27 ą 2% perf-profile.children.cycles-pp.mod_objcg_state > > > 2.76 -0.1 2.63 perf-profile.children.cycles-pp.unlink_anon_vmas > > > 3.41 -0.1 3.28 perf-profile.children.cycles-pp.mas_store_gfp > > > 2.21 -0.1 2.09 perf-profile.children.cycles-pp.__cond_resched > > > 2.04 -0.1 1.94 perf-profile.children.cycles-pp.allocate_slab > > > 2.10 -0.1 2.00 perf-profile.children.cycles-pp.__call_rcu_common > > > 2.51 -0.1 2.40 perf-profile.children.cycles-pp.flush_tlb_func > > > 1.04 -0.1 0.94 perf-profile.children.cycles-pp.mas_prev > > > 2.71 -0.1 2.61 perf-profile.children.cycles-pp.mtree_load > > > 2.23 -0.1 2.14 perf-profile.children.cycles-pp.native_flush_tlb_one_user > > > 0.22 ą 5% -0.1 0.13 ą 13% perf-profile.children.cycles-pp.vm_stat_account > > > 0.95 -0.1 0.87 perf-profile.children.cycles-pp.mas_prev_setup > > > 1.65 -0.1 1.57 perf-profile.children.cycles-pp.mas_wr_walk > > > 1.84 -0.1 1.76 perf-profile.children.cycles-pp.up_write > > > 1.27 -0.1 1.20 perf-profile.children.cycles-pp.mas_prev_slot > > > 1.84 -0.1 1.77 perf-profile.children.cycles-pp.vma_link > > > 1.39 -0.1 1.32 perf-profile.children.cycles-pp.shuffle_freelist > > > 0.96 -0.1 0.90 ą 2% perf-profile.children.cycles-pp.rcu_all_qs > > > 0.86 -0.1 0.80 perf-profile.children.cycles-pp._raw_spin_lock_irqsave > > > 1.70 -0.1 1.64 perf-profile.children.cycles-pp.__get_unmapped_area > > > 0.34 ą 3% -0.1 0.29 ą 5% perf-profile.children.cycles-pp.security_vm_enough_memory_mm > > > 0.60 -0.0 0.55 perf-profile.children.cycles-pp.entry_SYSCALL_64 > > > 0.92 -0.0 0.87 perf-profile.children.cycles-pp.percpu_counter_add_batch > > > 1.07 -0.0 1.02 perf-profile.children.cycles-pp.vma_to_resize > > > 1.59 -0.0 1.54 perf-profile.children.cycles-pp.mas_update_gap > > > 0.44 ą 2% -0.0 0.40 ą 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath > > > 0.70 -0.0 0.66 
perf-profile.children.cycles-pp.syscall_return_via_sysret > > > 1.13 -0.0 1.09 perf-profile.children.cycles-pp.mt_find > > > 0.20 ą 6% -0.0 0.17 ą 9% perf-profile.children.cycles-pp.cap_vm_enough_memory > > > 0.99 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node > > > 0.63 ą 2% -0.0 0.59 perf-profile.children.cycles-pp.security_mmap_addr > > > 0.62 -0.0 0.59 perf-profile.children.cycles-pp.__put_partials > > > 1.17 -0.0 1.14 perf-profile.children.cycles-pp.clear_bhb_loop > > > 0.46 -0.0 0.43 ą 2% perf-profile.children.cycles-pp.__alloc_pages_noprof > > > 0.44 -0.0 0.41 ą 2% perf-profile.children.cycles-pp.get_page_from_freelist > > > 0.90 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete > > > 0.64 ą 2% -0.0 0.62 perf-profile.children.cycles-pp.get_old_pud > > > 1.07 -0.0 1.05 perf-profile.children.cycles-pp.mas_leaf_max_gap > > > 0.22 ą 3% -0.0 0.20 ą 2% perf-profile.children.cycles-pp.__rmqueue_pcplist > > > 0.55 -0.0 0.53 perf-profile.children.cycles-pp.refill_obj_stock > > > 0.25 -0.0 0.23 ą 3% perf-profile.children.cycles-pp.rmqueue > > > 0.48 -0.0 0.45 perf-profile.children.cycles-pp.mremap_userfaultfd_prep > > > 0.33 -0.0 0.30 perf-profile.children.cycles-pp.free_unref_page > > > 0.46 -0.0 0.44 perf-profile.children.cycles-pp.setup_object > > > 0.21 ą 3% -0.0 0.19 ą 2% perf-profile.children.cycles-pp.rmqueue_bulk > > > 0.31 ą 3% -0.0 0.29 perf-profile.children.cycles-pp.__vm_enough_memory > > > 0.40 -0.0 0.38 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack > > > 0.36 -0.0 0.35 perf-profile.children.cycles-pp.madvise_vma_behavior > > > 0.54 -0.0 0.53 ą 2% perf-profile.children.cycles-pp.mas_wr_end_piv > > > 0.46 -0.0 0.44 ą 2% perf-profile.children.cycles-pp.rcu_segcblist_enqueue > > > 0.34 -0.0 0.32 ą 2% perf-profile.children.cycles-pp.mas_destroy > > > 0.28 -0.0 0.26 ą 3% perf-profile.children.cycles-pp.mas_wr_store_setup > > > 0.30 -0.0 0.28 perf-profile.children.cycles-pp.pte_offset_map_nolock > > > 0.19 -0.0 0.18 ą 2% perf-profile.children.cycles-pp.__thp_vma_allowable_orders > > > 0.08 ą 4% -0.0 0.07 perf-profile.children.cycles-pp.ksm_madvise > > > 0.17 -0.0 0.16 perf-profile.children.cycles-pp.get_any_partial > > > 0.08 -0.0 0.07 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare > > > 0.45 +0.0 0.47 perf-profile.children.cycles-pp._raw_spin_lock > > > 1.10 +0.0 1.14 perf-profile.children.cycles-pp.zap_pte_range > > > 0.78 +0.1 0.85 perf-profile.children.cycles-pp.__madvise > > > 0.63 +0.1 0.70 perf-profile.children.cycles-pp.__x64_sys_madvise > > > 0.62 +0.1 0.70 perf-profile.children.cycles-pp.do_madvise > > > 0.00 +0.1 0.09 ą 4% perf-profile.children.cycles-pp.can_modify_mm_madv > > > 1.32 +0.1 1.46 perf-profile.children.cycles-pp.mas_next_slot > > > 88.13 +0.7 88.83 perf-profile.children.cycles-pp.mremap > > > 83.94 +0.9 84.88 perf-profile.children.cycles-pp.__do_sys_mremap > > > 86.06 +0.9 87.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > > > 85.56 +1.0 86.54 perf-profile.children.cycles-pp.do_syscall_64 > > > 40.49 +1.4 41.90 perf-profile.children.cycles-pp.do_vmi_munmap > > > 2.10 +1.5 3.57 perf-profile.children.cycles-pp.do_munmap > > > 3.62 +2.3 5.90 perf-profile.children.cycles-pp.mas_walk > > > 5.44 +2.9 8.38 perf-profile.children.cycles-pp.mremap_to > > > 5.30 +3.1 8.39 perf-profile.children.cycles-pp.mas_find > > > 0.00 +5.4 5.40 perf-profile.children.cycles-pp.can_modify_mm > > > 11.46 -0.5 10.96 perf-profile.self.cycles-pp.__slab_free > > > 4.30 -0.2 4.08 
perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook > > > 2.51 -0.2 2.34 perf-profile.self.cycles-pp.rcu_cblist_dequeue > > > 2.41 ą 2% -0.2 2.25 perf-profile.self.cycles-pp.down_write > > > 2.21 -0.1 2.11 perf-profile.self.cycles-pp.native_flush_tlb_one_user > > > 2.37 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load > > > 1.60 -0.1 1.51 perf-profile.self.cycles-pp.__memcg_slab_free_hook > > > 0.18 ą 3% -0.1 0.10 ą 15% perf-profile.self.cycles-pp.vm_stat_account > > > 1.25 -0.1 1.18 perf-profile.self.cycles-pp.move_vma > > > 1.76 -0.1 1.69 perf-profile.self.cycles-pp.mod_objcg_state > > > 1.42 -0.1 1.35 ą 2% perf-profile.self.cycles-pp.__call_rcu_common > > > 1.41 -0.1 1.34 perf-profile.self.cycles-pp.mas_wr_walk > > > 1.52 -0.1 1.46 perf-profile.self.cycles-pp.up_write > > > 1.02 -0.1 0.95 perf-profile.self.cycles-pp.mas_prev_slot > > > 0.96 -0.1 0.90 ą 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb > > > 1.50 -0.1 1.45 perf-profile.self.cycles-pp.kmem_cache_free > > > 0.69 ą 3% -0.1 0.64 ą 2% perf-profile.self.cycles-pp.rcu_all_qs > > > 1.14 ą 2% -0.1 1.09 perf-profile.self.cycles-pp.shuffle_freelist > > > 1.10 -0.1 1.05 perf-profile.self.cycles-pp.__cond_resched > > > 1.40 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap > > > 0.99 -0.0 0.94 perf-profile.self.cycles-pp.mas_preallocate > > > 0.88 -0.0 0.83 perf-profile.self.cycles-pp.___slab_alloc > > > 0.55 -0.0 0.50 perf-profile.self.cycles-pp.mremap_to > > > 0.98 -0.0 0.93 perf-profile.self.cycles-pp.move_ptes > > > 0.78 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch > > > 0.21 ą 2% -0.0 0.18 ą 2% perf-profile.self.cycles-pp.entry_SYSCALL_64 > > > 0.44 ą 2% -0.0 0.40 ą 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath > > > 0.92 -0.0 0.89 perf-profile.self.cycles-pp.mas_store_gfp > > > 0.86 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node > > > 0.50 -0.0 0.46 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > > > 1.15 -0.0 1.12 perf-profile.self.cycles-pp.clear_bhb_loop > > > 1.14 -0.0 1.11 perf-profile.self.cycles-pp.vma_merge > > > 0.66 -0.0 0.63 perf-profile.self.cycles-pp.__split_vma > > > 0.16 ą 6% -0.0 0.13 ą 7% perf-profile.self.cycles-pp.cap_vm_enough_memory > > > 0.82 -0.0 0.79 perf-profile.self.cycles-pp.mas_wr_store_entry > > > 0.54 ą 2% -0.0 0.52 perf-profile.self.cycles-pp.get_old_pud > > > 0.43 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap > > > 0.51 ą 2% -0.0 0.48 ą 2% perf-profile.self.cycles-pp.security_mmap_addr > > > 0.50 -0.0 0.48 perf-profile.self.cycles-pp.refill_obj_stock > > > 0.24 -0.0 0.22 perf-profile.self.cycles-pp.mas_prev > > > 0.71 -0.0 0.69 perf-profile.self.cycles-pp.unmap_page_range > > > 0.48 -0.0 0.45 perf-profile.self.cycles-pp.find_vma_prev > > > 0.42 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave > > > 0.66 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc > > > 0.31 -0.0 0.29 perf-profile.self.cycles-pp.mas_prev_setup > > > 0.43 -0.0 0.41 perf-profile.self.cycles-pp.mas_wr_end_piv > > > 0.78 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete > > > 0.28 -0.0 0.26 ą 2% perf-profile.self.cycles-pp.mas_put_in_tree > > > 0.42 -0.0 0.40 perf-profile.self.cycles-pp.mremap_userfaultfd_prep > > > 0.28 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables > > > 0.39 -0.0 0.37 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > > > 0.30 ą 2% -0.0 0.28 perf-profile.self.cycles-pp.zap_pmd_range > > > 0.32 -0.0 0.31 perf-profile.self.cycles-pp.unmap_vmas > > > 0.21 -0.0 0.20 
perf-profile.self.cycles-pp.__get_unmapped_area > > > 0.18 ą 2% -0.0 0.17 ą 2% perf-profile.self.cycles-pp.lru_add_drain_cpu > > > 0.06 -0.0 0.05 perf-profile.self.cycles-pp.ksm_madvise > > > 0.45 +0.0 0.46 perf-profile.self.cycles-pp.do_vmi_munmap > > > 0.37 +0.0 0.39 perf-profile.self.cycles-pp._raw_spin_lock > > > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot > > > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find > > > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm > > > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk > > > > > > > > > *************************************************************************************************** > > > lkp-spr-r02: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory > > > ========================================================================================= > > > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: > > > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pkey/stress-ng/60s > > > > > > commit: > > > ff388fe5c4 ("mseal: wire up mseal syscall") > > > 8be7258aad ("mseal: add mseal syscall") > > > > > > ff388fe5c481d39c 8be7258aad44b5e25977a98db13 > > > ---------------- --------------------------- > > > %stddev %change %stddev > > > \ | \ > > > 10539 -2.5% 10273 vmstat.system.cs > > > 0.28 ą 5% -20.1% 0.22 ą 7% sched_debug.cfs_rq:/.h_nr_running.stddev > > > 1419 ą 7% -15.3% 1202 ą 6% sched_debug.cfs_rq:/.util_avg.max > > > 0.28 ą 6% -18.4% 0.23 ą 8% sched_debug.cpu.nr_running.stddev > > > 8.736e+08 -3.6% 8.423e+08 stress-ng.pkey.ops > > > 14560560 -3.6% 14038795 stress-ng.pkey.ops_per_sec > > > 770.39 ą 4% -5.0% 732.04 stress-ng.time.user_time > > > 244657 ą 3% +5.8% 258782 ą 3% proc-vmstat.nr_slab_unreclaimable > > > 73133541 -2.1% 71588873 proc-vmstat.numa_hit > > > 72873579 -2.1% 71357274 proc-vmstat.numa_local > > > 1.842e+08 -2.5% 1.796e+08 proc-vmstat.pgalloc_normal > > > 1.767e+08 -2.8% 1.717e+08 proc-vmstat.pgfree > > > 1345346 ą 40% -73.1% 362064 ą124% numa-vmstat.node0.nr_inactive_anon > > > 1345340 ą 40% -73.1% 362062 ą124% numa-vmstat.node0.nr_zone_inactive_anon > > > 2420830 ą 14% +35.1% 3270248 ą 16% numa-vmstat.node1.nr_file_pages > > > 2067871 ą 13% +51.5% 3132982 ą 17% numa-vmstat.node1.nr_inactive_anon > > > 191406 ą 17% +33.6% 255808 ą 14% numa-vmstat.node1.nr_mapped > > > 2452 ą 61% +104.4% 5012 ą 35% numa-vmstat.node1.nr_page_table_pages > > > 2067853 ą 13% +51.5% 3132966 ą 17% numa-vmstat.node1.nr_zone_inactive_anon > > > 5379238 ą 40% -73.0% 1453605 ą123% numa-meminfo.node0.Inactive > > > 5379166 ą 40% -73.0% 1453462 ą123% numa-meminfo.node0.Inactive(anon) > > > 8741077 ą 22% -36.7% 5531290 ą 28% numa-meminfo.node0.MemUsed > > > 9651902 ą 13% +35.8% 13105318 ą 16% numa-meminfo.node1.FilePages > > > 8239855 ą 13% +52.4% 12556929 ą 17% numa-meminfo.node1.Inactive > > > 8239712 ą 13% +52.4% 12556853 ą 17% numa-meminfo.node1.Inactive(anon) > > > 761944 ą 18% +34.6% 1025906 ą 14% numa-meminfo.node1.Mapped > > > 11679628 ą 11% +31.2% 15322841 ą 14% numa-meminfo.node1.MemUsed > > > 9874 ą 62% +104.6% 20200 ą 36% numa-meminfo.node1.PageTables > > > 0.74 -4.2% 0.71 perf-stat.i.MPKI > > > 1.245e+11 +2.3% 1.274e+11 perf-stat.i.branch-instructions > > > 0.37 -0.0 0.35 perf-stat.i.branch-miss-rate% > > > 4.359e+08 -2.1% 4.265e+08 perf-stat.i.branch-misses > > > 4.672e+08 -2.6% 4.548e+08 perf-stat.i.cache-misses > > > 7.276e+08 -2.7% 7.082e+08 perf-stat.i.cache-references > > > 1.00 -1.6% 0.98 
perf-stat.i.cpi > > > 1364 +2.9% 1404 perf-stat.i.cycles-between-cache-misses > > > 6.392e+11 +1.7% 6.499e+11 perf-stat.i.instructions > > > 1.00 +1.6% 1.02 perf-stat.i.ipc > > > 0.74 -4.3% 0.71 perf-stat.overall.MPKI > > > 0.35 -0.0 0.33 perf-stat.overall.branch-miss-rate% > > > 1.00 -1.6% 0.99 perf-stat.overall.cpi > > > 1356 +2.9% 1395 perf-stat.overall.cycles-between-cache-misses > > > 1.00 +1.6% 1.01 perf-stat.overall.ipc > > > 1.209e+11 +1.9% 1.232e+11 perf-stat.ps.branch-instructions > > > 4.188e+08 -2.6% 4.077e+08 perf-stat.ps.branch-misses > > > 4.585e+08 -3.1% 4.441e+08 perf-stat.ps.cache-misses > > > 7.124e+08 -3.1% 6.901e+08 perf-stat.ps.cache-references > > > 10321 -2.6% 10053 perf-stat.ps.context-switches > > > > > > > > > > > > > > > > > > Disclaimer: > > > Results have been estimated based on internal Intel analysis and are provided > > > for informational purposes only. Any difference in system hardware or software > > > design or configuration may affect actual performance. > > > > > > > > > -- > > > 0-DAY CI Kernel Test Service > > > https://github.com/intel/lkp-tests/wiki > > > ^ permalink raw reply [flat|nested] 29+ messages in thread
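[Editor's note] The profile above attributes essentially all of the new cycles to can_modify_mm()/mas_find()/mas_walk() under the mremap paths (mremap_to, move_vma, do_vmi_munmap). For readers who want to exercise that same path outside stress-ng, below is a minimal, self-contained sketch: it is not the stress-ng pagemove stressor itself, the mapping size and iteration count are arbitrary, and it only bounces an anonymous mapping between two fixed addresses with mremap(MREMAP_FIXED | MREMAP_MAYMOVE), which is the syscall pattern the pagemove numbers are reported against.

```c
/*
 * Minimal mremap "page move" loop (illustrative only, not the actual
 * stress-ng stressor).  Each iteration moves a 16-page anonymous mapping
 * to the other half of a reserved window, which drives the
 * mremap_to() -> move_vma() -> do_vmi_munmap() kernel path seen in the
 * profile above.
 *
 * Build: gcc -O2 -o pagemove_sketch pagemove_sketch.c
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

int main(void)
{
	const size_t len = 16 * 4096;
	/* Reserve a window twice the size so the mapping can bounce
	 * between two fixed addresses inside it. */
	char *win = mmap(NULL, 2 * len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (win == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	memset(win, 1, len);		/* fault in the first half */
	munmap(win + len, len);		/* keep only the first half mapped */

	char *src = win, *dst = win + len;
	struct timespec t0, t1;
	long iters = 200000;		/* arbitrary iteration count */

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (long i = 0; i < iters; i++) {
		/* Move the mapping to the other fixed address; the old
		 * range is unmapped by the kernel as part of the move. */
		char *p = mremap(src, len, len,
				 MREMAP_FIXED | MREMAP_MAYMOVE, dst);
		if (p == MAP_FAILED) {
			perror("mremap");
			return 1;
		}
		char *tmp = src;
		src = dst;
		dst = tmp;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	double secs = (t1.tv_sec - t0.tv_sec) +
		      (t1.tv_nsec - t0.tv_nsec) / 1e9;
	printf("%.0f mremap moves/sec\n", iters / secs);
	return 0;
}
```

Comparing the printed rate on the two commits (under perf record, for example) should show the same shift toward can_modify_mm()/mas_walk() that the 0-day profile reports, though absolute numbers will of course differ from the stress-ng metric.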
Thread overview: 29+ messages

2024-08-04  8:59 [linus:master] [mseal] 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression — kernel test robot
2024-08-04 20:32 ` Linus Torvalds
2024-08-05 13:33 ` Pedro Falcato
2024-08-05 18:10 ` Jeff Xu
2024-08-05 18:55 ` Linus Torvalds
2024-08-05 19:33 ` Linus Torvalds
2024-08-06  2:14 ` Michael Ellerman
2024-08-06  2:17 ` Linus Torvalds
2024-08-06 12:03 ` Michael Ellerman
2024-08-06 14:43 ` Linus Torvalds
2024-08-07 12:26 ` Michael Ellerman
2024-08-06  6:04 ` Oliver Sang
2024-08-06 14:38 ` Linus Torvalds
2024-08-06 21:37 ` Pedro Falcato
2024-08-07  5:54 ` Oliver Sang
2024-08-05 19:37 ` Jeff Xu
2024-08-05 19:48 ` Linus Torvalds
2024-08-05 19:50 ` Linus Torvalds
2024-08-05 23:24 ` Nicholas Piggin
2024-08-06  0:13 ` Linus Torvalds
2024-08-06  1:22 ` Jeff Xu
2024-08-06  2:01 ` Michael Ellerman
2024-08-06  2:15 ` Linus Torvalds
2024-09-13  5:47 ` Christophe Leroy
2024-08-05 17:54 ` Jeff Xu
2024-08-05 13:56 ` Jeff Xu
2024-08-05 16:58 ` Jeff Xu
2024-08-06  1:44 ` Oliver Sang
2024-08-06 14:54 ` Jeff Xu