* [tj-cgroup:for-7.1] [cgroup] 4616120fca: stress-ng.mremap.ops_per_sec 48.3% improvement
From: kernel test robot @ 2026-03-22 10:45 UTC
To: Shakeel Butt; +Cc: oe-lkp, lkp, cgroups, Tejun Heo, Jakub Kicinski, oliver.sang
Hello,
kernel test robot noticed a 48.3% improvement of stress-ng.mremap.ops_per_sec on:
commit: 4616120fca7f6d48b4c640e3975352e451e9c2ce ("cgroup: add lockless fast-path checks to cgroup_file_notify()")
https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git for-7.1
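For readers unfamiliar with the pattern named in the commit subject, the following is a minimal user-space sketch of a lockless fast-path check in front of a lock-protected, rate-limited notifier. It is not the kernel code from this commit: the class name, the `MIN_INTERVAL` value, and the counters are hypothetical, and the toy simply drops rate-limited calls rather than deferring them.

```python
import threading
import time

MIN_INTERVAL = 0.01  # hypothetical minimum gap between notifications (seconds)


class RateLimitedNotifier:
    """Toy model: a notifier guarded by a lock, with an unlocked pre-check."""

    def __init__(self, clock=time.monotonic):
        self._lock = threading.Lock()
        self._clock = clock
        self._last = float("-inf")   # time of the last delivered notification
        self.delivered = 0
        self.lock_acquisitions = 0

    def notify(self):
        # Fast path: read the timestamp without taking the lock. A stale
        # read is tolerable because the check is repeated under the lock.
        if self._clock() - self._last < MIN_INTERVAL:
            return False
        # Slow path: take the lock, re-check, then deliver.
        with self._lock:
            self.lock_acquisitions += 1
            now = self._clock()
            if now - self._last < MIN_INTERVAL:
                return False
            self._last = now
            self.delivered += 1
            return True
```

With many callers hitting `notify()` inside the rate-limit window, almost all of them return on the unlocked check, so the lock is taken only about once per interval.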
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: mremap
cpufreq_governor: performance
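As a rough guide to what these parameters mean, the sketch below maps them onto a plain stress-ng command line. The helper name is hypothetical, and the exact options the LKP harness passes may differ; the cpufreq governor is set outside stress-ng and is not covered here.

```python
def build_stress_ng_argv(test, online_cpus, nr_threads_pct, testtime):
    """Translate LKP-style parameters into a stress-ng argument list."""
    # nr_threads is a percentage of the machine's hardware threads.
    workers = max(1, online_cpus * nr_threads_pct // 100)
    return [
        "stress-ng",
        f"--{test}", str(workers),   # e.g. --mremap 192
        "--timeout", testtime,       # run duration, e.g. 60s
        "--metrics-brief",           # print per-stressor ops and ops/sec
    ]


# 192-thread test machine, nr_threads 100%, testtime 60s, test mremap
argv = build_stress_ng_argv("mremap", 192, 100, "60s")
print(" ".join(argv))
# prints: stress-ng --mremap 192 --timeout 60s --metrics-brief
```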
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260322/202603221824.c32929f7-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp2/mremap/stress-ng/60s
commit:
05070cd654 ("cgroup: reduce cgroup_file_kn_lock hold time in cgroup_file_notify()")
4616120fca ("cgroup: add lockless fast-path checks to cgroup_file_notify()")
05070cd654f38346 4616120fca7f6d48b4c640e3975
---------------- ---------------------------
base ± %stddev   %change   new ± %stddev   metric
670909 ± 2% +48.3% 995054 stress-ng.mremap.ops
11186 ± 2% +48.3% 16595 stress-ng.mremap.ops_per_sec
11269 -1.6% 11094 stress-ng.time.system_time
181.60 +72.7% 313.62 ± 7% stress-ng.time.user_time
2870 ± 9% +55.4% 4461 ± 18% perf-c2c.DRAM.local
349.27 +3.4% 361.27 turbostat.PkgWatt
400595 +1.6% 407116 vmstat.system.in
0.11 +0.1 0.18 ± 24% mpstat.cpu.all.irq%
0.06 ± 7% +0.3 0.33 ± 37% mpstat.cpu.all.soft%
1.86 +1.1 2.98 ± 6% mpstat.cpu.all.usr%
19725586 ± 4% +13.1% 22316349 ± 4% perf-stat.i.branch-misses
5.641e+08 ± 4% +113.7% 1.206e+09 ± 19% perf-stat.i.cache-references
19314576 +15.6% 22334802 ± 2% perf-stat.ps.branch-misses
5.495e+08 ± 4% +112.9% 1.17e+09 ± 19% perf-stat.ps.cache-references
0.02 ± 31% -85.9% 0.00 ±223% perf-stat.ps.major-faults
0.44 ± 7% +35.0% 0.60 ± 15% perf-sched.sch_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
0.44 ± 7% +35.0% 0.60 ± 15% perf-sched.total_sch_delay.average.ms
174.61 +7.1% 186.97 ± 4% perf-sched.total_wait_and_delay.average.ms
174.16 +7.0% 186.37 ± 4% perf-sched.total_wait_time.average.ms
174.61 +7.1% 186.97 ± 4% perf-sched.wait_and_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
174.16 +7.0% 186.37 ± 4% perf-sched.wait_time.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
92.34 -92.3 0.00 perf-profile.calltrace.cycles-pp.__mem_cgroup_try_charge_swap.folio_alloc_swap.shrink_folio_list.reclaim_folio_list.reclaim_pages
92.34 -92.3 0.00 perf-profile.calltrace.cycles-pp.__memcg_memory_event.__mem_cgroup_try_charge_swap.folio_alloc_swap.shrink_folio_list.reclaim_folio_list
92.03 -92.0 0.00 perf-profile.calltrace.cycles-pp.cgroup_file_notify.__memcg_memory_event.__mem_cgroup_try_charge_swap.folio_alloc_swap.shrink_folio_list
92.02 -92.0 0.00 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.cgroup_file_notify.__memcg_memory_event.__mem_cgroup_try_charge_swap.folio_alloc_swap
91.94 -91.9 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.cgroup_file_notify.__memcg_memory_event.__mem_cgroup_try_charge_swap
92.62 -48.5 44.08 ± 77% perf-profile.calltrace.cycles-pp.shrink_folio_list.reclaim_folio_list.reclaim_pages.madvise_cold_or_pageout_pte_range.walk_pmd_range
92.51 -48.5 44.00 ± 77% perf-profile.calltrace.cycles-pp.folio_alloc_swap.shrink_folio_list.reclaim_folio_list.reclaim_pages.madvise_cold_or_pageout_pte_range
92.71 -47.8 44.90 ± 74% perf-profile.calltrace.cycles-pp.reclaim_folio_list.reclaim_pages.madvise_cold_or_pageout_pte_range.walk_pmd_range.walk_pud_range
92.72 -47.8 44.92 ± 74% perf-profile.calltrace.cycles-pp.reclaim_pages.madvise_cold_or_pageout_pte_range.walk_pmd_range.walk_pud_range.walk_p4d_range
93.60 -23.2 70.43 ± 22% perf-profile.calltrace.cycles-pp.__walk_page_range.walk_page_range_vma_unsafe.madvise_pageout.madvise_vma_behavior.madvise_do_behavior
93.60 -23.2 70.43 ± 22% perf-profile.calltrace.cycles-pp.walk_p4d_range.walk_pgd_range.__walk_page_range.walk_page_range_vma_unsafe.madvise_pageout
93.60 -23.2 70.43 ± 22% perf-profile.calltrace.cycles-pp.walk_page_range_vma_unsafe.madvise_pageout.madvise_vma_behavior.madvise_do_behavior.do_madvise
93.60 -23.2 70.43 ± 22% perf-profile.calltrace.cycles-pp.walk_pgd_range.__walk_page_range.walk_page_range_vma_unsafe.madvise_pageout.madvise_vma_behavior
93.61 -23.0 70.61 ± 22% perf-profile.calltrace.cycles-pp.madvise_pageout.madvise_vma_behavior.madvise_do_behavior.do_madvise.__x64_sys_madvise
93.73 -22.4 71.38 ± 21% perf-profile.calltrace.cycles-pp.walk_pmd_range.walk_pud_range.walk_p4d_range.walk_pgd_range.__walk_page_range
93.73 -22.4 71.38 ± 21% perf-profile.calltrace.cycles-pp.madvise_cold_or_pageout_pte_range.walk_pmd_range.walk_pud_range.walk_p4d_range.walk_pgd_range
93.73 -22.4 71.38 ± 21% perf-profile.calltrace.cycles-pp.walk_pud_range.walk_p4d_range.walk_pgd_range.__walk_page_range.walk_page_range_vma_unsafe
93.91 -21.8 72.07 ± 20% perf-profile.calltrace.cycles-pp.madvise_vma_behavior.madvise_do_behavior.do_madvise.__x64_sys_madvise.do_syscall_64
94.15 -20.9 73.21 ± 19% perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
94.14 -20.9 73.20 ± 19% perf-profile.calltrace.cycles-pp.madvise_do_behavior.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe
94.15 -20.9 73.21 ± 19% perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
94.15 -20.9 73.21 ± 19% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
94.15 -20.9 73.22 ± 19% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
94.16 -20.9 73.23 ± 19% perf-profile.calltrace.cycles-pp.__madvise
0.69 ± 3% +0.4 1.13 ± 9% perf-profile.calltrace.cycles-pp.stress_mmap_check
0.00 +0.5 0.53 ± 3% perf-profile.calltrace.cycles-pp.move_vma.do_mremap.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.74 ± 5% +0.7 1.39 ± 18% perf-profile.calltrace.cycles-pp.stress_mmap_set
0.00 +43.5 43.45 ± 77% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.folio_alloc_swap.shrink_folio_list.reclaim_folio_list
0.00 +43.8 43.76 ± 77% perf-profile.calltrace.cycles-pp._raw_spin_lock.folio_alloc_swap.shrink_folio_list.reclaim_folio_list.reclaim_pages
92.34 -92.1 0.23 ± 50% perf-profile.children.cycles-pp.__memcg_memory_event
92.34 -92.1 0.24 ± 49% perf-profile.children.cycles-pp.__mem_cgroup_try_charge_swap
92.04 -92.0 0.00 perf-profile.children.cycles-pp.cgroup_file_notify
93.22 -71.6 21.58 ± 68% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
92.62 -48.5 44.08 ± 77% perf-profile.children.cycles-pp.shrink_folio_list
92.51 -48.5 44.00 ± 77% perf-profile.children.cycles-pp.folio_alloc_swap
92.71 -47.8 44.90 ± 74% perf-profile.children.cycles-pp.reclaim_folio_list
92.72 -47.8 44.92 ± 74% perf-profile.children.cycles-pp.reclaim_pages
93.61 -23.0 70.61 ± 22% perf-profile.children.cycles-pp.madvise_pageout
93.86 -22.3 71.51 ± 21% perf-profile.children.cycles-pp.madvise_cold_or_pageout_pte_range
93.90 -22.3 71.59 ± 21% perf-profile.children.cycles-pp.walk_pud_range
93.90 -22.3 71.60 ± 21% perf-profile.children.cycles-pp.walk_p4d_range
93.90 -22.3 71.59 ± 21% perf-profile.children.cycles-pp.walk_pmd_range
93.90 -22.3 71.60 ± 21% perf-profile.children.cycles-pp.__walk_page_range
93.90 -22.3 71.60 ± 21% perf-profile.children.cycles-pp.walk_pgd_range
93.90 -22.3 71.60 ± 21% perf-profile.children.cycles-pp.walk_page_range_vma_unsafe
93.91 -21.8 72.07 ± 20% perf-profile.children.cycles-pp.madvise_vma_behavior
94.15 -20.9 73.21 ± 19% perf-profile.children.cycles-pp.do_madvise
94.14 -20.9 73.20 ± 19% perf-profile.children.cycles-pp.madvise_do_behavior
94.15 -20.9 73.21 ± 19% perf-profile.children.cycles-pp.__x64_sys_madvise
94.16 -20.9 73.23 ± 19% perf-profile.children.cycles-pp.__madvise
93.81 -3.8 90.06 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
98.15 -1.2 96.99 perf-profile.children.cycles-pp.do_syscall_64
98.15 -1.2 97.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.08 ± 14% +0.0 0.11 ± 6% perf-profile.children.cycles-pp.free_unref_folios
0.06 +0.0 0.10 ± 4% perf-profile.children.cycles-pp.mas_store_prealloc
0.05 +0.0 0.09 ± 4% perf-profile.children.cycles-pp.__pi_memcpy
0.05 +0.0 0.09 ± 22% perf-profile.children.cycles-pp.update_process_times
0.07 +0.0 0.12 ± 6% perf-profile.children.cycles-pp.mas_store_gfp
0.14 ± 12% +0.0 0.18 ± 6% perf-profile.children.cycles-pp.move_ptes
0.06 +0.0 0.11 ± 22% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.05 +0.1 0.10 ± 26% perf-profile.children.cycles-pp.tick_nohz_handler
0.00 +0.1 0.05 ± 7% perf-profile.children.cycles-pp.sched_tick
0.03 ± 70% +0.1 0.09 ± 5% perf-profile.children.cycles-pp.vma_link
0.16 ± 12% +0.1 0.21 ± 5% perf-profile.children.cycles-pp.move_page_tables
0.00 +0.1 0.06 ± 9% perf-profile.children.cycles-pp.unmapped_area_topdown
0.00 +0.1 0.06 ± 9% perf-profile.children.cycles-pp.vm_unmapped_area
0.01 ±223% +0.1 0.07 ± 7% perf-profile.children.cycles-pp.__get_unmapped_area
0.06 +0.1 0.12 ± 6% perf-profile.children.cycles-pp.__mmap_new_vma
0.08 +0.1 0.14 ± 6% perf-profile.children.cycles-pp.copy_vma
0.00 +0.1 0.06 ± 11% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown
0.09 ± 4% +0.1 0.16 ± 3% perf-profile.children.cycles-pp.mas_wr_node_store
0.08 ± 6% +0.1 0.16 ± 28% perf-profile.children.cycles-pp.hrtimer_interrupt
0.06 +0.1 0.14 ± 5% perf-profile.children.cycles-pp.mas_preallocate
0.08 +0.1 0.17 ± 26% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.07 ± 6% +0.1 0.17 ± 6% perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
0.13 ± 2% +0.1 0.23 ± 2% perf-profile.children.cycles-pp.__mmap_region
0.00 +0.1 0.11 ± 5% perf-profile.children.cycles-pp.__refill_objects_node
0.01 ±223% +0.1 0.12 ± 4% perf-profile.children.cycles-pp.__pcs_replace_empty_main
0.00 +0.1 0.11 ± 6% perf-profile.children.cycles-pp.refill_objects
0.16 ± 3% +0.1 0.28 ± 2% perf-profile.children.cycles-pp.do_mmap
0.24 ± 8% +0.1 0.35 ± 2% perf-profile.children.cycles-pp.copy_vma_and_data
0.35 ± 3% +0.2 0.54 ± 3% perf-profile.children.cycles-pp.move_vma
0.05 ± 7% +0.3 0.32 ± 44% perf-profile.children.cycles-pp.__irq_exit_rcu
0.05 +0.3 0.32 ± 43% perf-profile.children.cycles-pp.handle_softirqs
0.00 +0.3 0.27 ± 53% perf-profile.children.cycles-pp.__slab_free
0.00 +0.3 0.28 ± 50% perf-profile.children.cycles-pp.__kmem_cache_free_bulk
0.00 +0.3 0.29 ± 49% perf-profile.children.cycles-pp.rcu_free_sheaf
0.00 +0.3 0.31 ± 44% perf-profile.children.cycles-pp.rcu_do_batch
0.00 +0.3 0.31 ± 43% perf-profile.children.cycles-pp.rcu_core
0.20 ± 25% +0.3 0.54 ± 8% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.14 ± 2% +0.3 0.49 ± 19% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.69 ± 3% +0.4 1.13 ± 9% perf-profile.children.cycles-pp.stress_mmap_check
0.77 ± 8% +0.7 1.42 ± 15% perf-profile.children.cycles-pp.stress_mmap_set
0.19 ± 7% +0.7 0.89 ± 51% perf-profile.children.cycles-pp.__vm_munmap
0.19 ± 7% +0.7 0.89 ± 51% perf-profile.children.cycles-pp.__x64_sys_munmap
0.21 ± 8% +0.7 0.92 ± 49% perf-profile.children.cycles-pp.__munmap
0.22 ± 7% +0.9 1.12 ± 53% perf-profile.children.cycles-pp.faultin_page_range
0.28 ± 88% +1.0 1.28 ± 47% perf-profile.children.cycles-pp.madvise_cold
0.25 ± 13% +43.6 43.88 ± 76% perf-profile.children.cycles-pp._raw_spin_lock
93.81 -3.9 89.93 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.12 ± 13% -0.1 0.07 ± 5% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.05 +0.0 0.09 ± 4% perf-profile.self.cycles-pp.__pi_memcpy
0.00 +0.1 0.05 perf-profile.self.cycles-pp.move_ptes
0.00 +0.1 0.11 ± 4% perf-profile.self.cycles-pp.__refill_objects_node
0.19 ± 12% +0.2 0.41 ± 31% perf-profile.self.cycles-pp._raw_spin_lock
0.69 ± 3% +0.4 1.12 ± 9% perf-profile.self.cycles-pp.stress_mmap_check
0.68 ± 11% +0.6 1.32 ± 27% perf-profile.self.cycles-pp.stress_mmap_set
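A note on reading the middle column, based on the values above: for raw metrics (ops, times, counters) it is a relative change in percent, while for metrics that are already percentages (perf-profile, mpstat) it is the plain difference in percentage points. The sketch below recomputes a few rows; the function names are hypothetical.

```python
def relative_change(base, new):
    """Relative change in percent, as used for raw metrics."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Difference in percentage points, as used for percent-valued metrics."""
    return new - base


print(f"{relative_change(670909, 995054):+.1f}%")  # stress-ng.mremap.ops      -> +48.3%
print(f"{relative_change(181.60, 313.62):+.1f}%")  # stress-ng.time.user_time  -> +72.7%
print(f"{point_change(92.34, 0.00):+.1f}")         # __memcg_memory_event path -> -92.3
print(f"{point_change(1.86, 2.98):+.1f}")          # mpstat.cpu.all.usr%       -> +1.1
```

Recomputing from the rounded means printed in the table can differ from the reported figure in the last digit: the ops_per_sec row gives about +48.4% this way, against the reported +48.3%.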
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki