From: kernel test robot <oliver.sang@intel.com>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<cgroups@vger.kernel.org>, Tejun Heo <tj@kernel.org>,
	Jakub Kicinski <kuba@kernel.org>, <oliver.sang@intel.com>
Subject: [tj-cgroup:for-7.1] [cgroup]  4616120fca: stress-ng.mremap.ops_per_sec 48.3% improvement
Date: Sun, 22 Mar 2026 18:45:35 +0800
Message-ID: <202603221824.c32929f7-lkp@intel.com>



Hello,

kernel test robot noticed a 48.3% improvement in stress-ng.mremap.ops_per_sec on:


commit: 4616120fca7f6d48b4c640e3975352e451e9c2ce ("cgroup: add lockless fast-path checks to cgroup_file_notify()")
https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git for-7.1
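
The patch body is not quoted in this report. In mainline, cgroup_file_notify()
takes the global cgroup_file_kn_lock spinlock on every call, and that is the
lock the profile below shows collapsing under per-event memcg notifications.
A plausible shape for the lockless fast-path checks is sketched below; it is
built only on the mainline fields (cfile->kn, cfile->notified_at,
cfile->notify_timer, CGROUP_FILE_NOTIFY_MIN_INTV) and is not the actual
4616120fca patch body:

	/*
	 * Sketch only, not the actual patch.  Mainline takes
	 * cgroup_file_kn_lock unconditionally; with one memcg event per
	 * swap charge, that global spinlock dominates the workload.
	 * Two lockless early-outs cover the common cases; everything
	 * else falls through to the unchanged locked slow path.
	 */
	void cgroup_file_notify(struct cgroup_file *cfile)
	{
		unsigned long flags;
		unsigned long last = READ_ONCE(cfile->notified_at);

		/* File not created yet, or already being removed. */
		if (!READ_ONCE(cfile->kn))
			return;

		/*
		 * Still inside the rate-limit window and a deferred
		 * notification is already armed: that timer will fire
		 * for us, so there is nothing to do here.
		 */
		if (time_in_range(jiffies, last,
				  last + CGROUP_FILE_NOTIFY_MIN_INTV) &&
		    timer_pending(&cfile->notify_timer))
			return;

		spin_lock_irqsave(&cgroup_file_kn_lock, flags);
		/* ... existing notify/rate-limit logic, rechecked under the lock ... */
		spin_unlock_irqrestore(&cgroup_file_kn_lock, flags);
	}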


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: mremap
	cpufreq_governor: performance
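
Outside the LKP harness, the parameters above map roughly to the following
manual run (a sketch; the robot's exact job file is in the reproduction
archive linked below):

	# cpufreq_governor: performance (requires root)
	echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

	# nr_threads: 100% -> one mremap worker per CPU; testtime: 60s
	stress-ng --mremap "$(nproc)" --timeout 60s --metrics-brief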


Details are as follows:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260322/202603221824.c32929f7-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp2/mremap/stress-ng/60s

commit: 
  05070cd654 ("cgroup: reduce cgroup_file_kn_lock hold time in cgroup_file_notify()")
  4616120fca ("cgroup: add lockless fast-path checks to cgroup_file_notify()")

05070cd654f38346 4616120fca7f6d48b4c640e3975 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    670909 ±  2%     +48.3%     995054        stress-ng.mremap.ops
     11186 ±  2%     +48.3%      16595        stress-ng.mremap.ops_per_sec
     11269            -1.6%      11094        stress-ng.time.system_time
    181.60           +72.7%     313.62 ±  7%  stress-ng.time.user_time
      2870 ±  9%     +55.4%       4461 ± 18%  perf-c2c.DRAM.local
    349.27            +3.4%     361.27        turbostat.PkgWatt
    400595            +1.6%     407116        vmstat.system.in
      0.11            +0.1        0.18 ± 24%  mpstat.cpu.all.irq%
      0.06 ±  7%      +0.3        0.33 ± 37%  mpstat.cpu.all.soft%
      1.86            +1.1        2.98 ±  6%  mpstat.cpu.all.usr%
  19725586 ±  4%     +13.1%   22316349 ±  4%  perf-stat.i.branch-misses
 5.641e+08 ±  4%    +113.7%  1.206e+09 ± 19%  perf-stat.i.cache-references
  19314576           +15.6%   22334802 ±  2%  perf-stat.ps.branch-misses
 5.495e+08 ±  4%    +112.9%   1.17e+09 ± 19%  perf-stat.ps.cache-references
      0.02 ± 31%     -85.9%       0.00 ±223%  perf-stat.ps.major-faults
      0.44 ±  7%     +35.0%       0.60 ± 15%  perf-sched.sch_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
      0.44 ±  7%     +35.0%       0.60 ± 15%  perf-sched.total_sch_delay.average.ms
    174.61            +7.1%     186.97 ±  4%  perf-sched.total_wait_and_delay.average.ms
    174.16            +7.0%     186.37 ±  4%  perf-sched.total_wait_time.average.ms
    174.61            +7.1%     186.97 ±  4%  perf-sched.wait_and_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
    174.16            +7.0%     186.37 ±  4%  perf-sched.wait_time.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
     92.34           -92.3        0.00        perf-profile.calltrace.cycles-pp.__mem_cgroup_try_charge_swap.folio_alloc_swap.shrink_folio_list.reclaim_folio_list.reclaim_pages
     92.34           -92.3        0.00        perf-profile.calltrace.cycles-pp.__memcg_memory_event.__mem_cgroup_try_charge_swap.folio_alloc_swap.shrink_folio_list.reclaim_folio_list
     92.03           -92.0        0.00        perf-profile.calltrace.cycles-pp.cgroup_file_notify.__memcg_memory_event.__mem_cgroup_try_charge_swap.folio_alloc_swap.shrink_folio_list
     92.02           -92.0        0.00        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.cgroup_file_notify.__memcg_memory_event.__mem_cgroup_try_charge_swap.folio_alloc_swap
     91.94           -91.9        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.cgroup_file_notify.__memcg_memory_event.__mem_cgroup_try_charge_swap
     92.62           -48.5       44.08 ± 77%  perf-profile.calltrace.cycles-pp.shrink_folio_list.reclaim_folio_list.reclaim_pages.madvise_cold_or_pageout_pte_range.walk_pmd_range
     92.51           -48.5       44.00 ± 77%  perf-profile.calltrace.cycles-pp.folio_alloc_swap.shrink_folio_list.reclaim_folio_list.reclaim_pages.madvise_cold_or_pageout_pte_range
     92.71           -47.8       44.90 ± 74%  perf-profile.calltrace.cycles-pp.reclaim_folio_list.reclaim_pages.madvise_cold_or_pageout_pte_range.walk_pmd_range.walk_pud_range
     92.72           -47.8       44.92 ± 74%  perf-profile.calltrace.cycles-pp.reclaim_pages.madvise_cold_or_pageout_pte_range.walk_pmd_range.walk_pud_range.walk_p4d_range
     93.60           -23.2       70.43 ± 22%  perf-profile.calltrace.cycles-pp.__walk_page_range.walk_page_range_vma_unsafe.madvise_pageout.madvise_vma_behavior.madvise_do_behavior
     93.60           -23.2       70.43 ± 22%  perf-profile.calltrace.cycles-pp.walk_p4d_range.walk_pgd_range.__walk_page_range.walk_page_range_vma_unsafe.madvise_pageout
     93.60           -23.2       70.43 ± 22%  perf-profile.calltrace.cycles-pp.walk_page_range_vma_unsafe.madvise_pageout.madvise_vma_behavior.madvise_do_behavior.do_madvise
     93.60           -23.2       70.43 ± 22%  perf-profile.calltrace.cycles-pp.walk_pgd_range.__walk_page_range.walk_page_range_vma_unsafe.madvise_pageout.madvise_vma_behavior
     93.61           -23.0       70.61 ± 22%  perf-profile.calltrace.cycles-pp.madvise_pageout.madvise_vma_behavior.madvise_do_behavior.do_madvise.__x64_sys_madvise
     93.73           -22.4       71.38 ± 21%  perf-profile.calltrace.cycles-pp.walk_pmd_range.walk_pud_range.walk_p4d_range.walk_pgd_range.__walk_page_range
     93.73           -22.4       71.38 ± 21%  perf-profile.calltrace.cycles-pp.madvise_cold_or_pageout_pte_range.walk_pmd_range.walk_pud_range.walk_p4d_range.walk_pgd_range
     93.73           -22.4       71.38 ± 21%  perf-profile.calltrace.cycles-pp.walk_pud_range.walk_p4d_range.walk_pgd_range.__walk_page_range.walk_page_range_vma_unsafe
     93.91           -21.8       72.07 ± 20%  perf-profile.calltrace.cycles-pp.madvise_vma_behavior.madvise_do_behavior.do_madvise.__x64_sys_madvise.do_syscall_64
     94.15           -20.9       73.21 ± 19%  perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
     94.14           -20.9       73.20 ± 19%  perf-profile.calltrace.cycles-pp.madvise_do_behavior.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe
     94.15           -20.9       73.21 ± 19%  perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
     94.15           -20.9       73.21 ± 19%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
     94.15           -20.9       73.22 ± 19%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
     94.16           -20.9       73.23 ± 19%  perf-profile.calltrace.cycles-pp.__madvise
      0.69 ±  3%      +0.4        1.13 ±  9%  perf-profile.calltrace.cycles-pp.stress_mmap_check
      0.00            +0.5        0.53 ±  3%  perf-profile.calltrace.cycles-pp.move_vma.do_mremap.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.74 ±  5%      +0.7        1.39 ± 18%  perf-profile.calltrace.cycles-pp.stress_mmap_set
      0.00           +43.5       43.45 ± 77%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.folio_alloc_swap.shrink_folio_list.reclaim_folio_list
      0.00           +43.8       43.76 ± 77%  perf-profile.calltrace.cycles-pp._raw_spin_lock.folio_alloc_swap.shrink_folio_list.reclaim_folio_list.reclaim_pages
     92.34           -92.1        0.23 ± 50%  perf-profile.children.cycles-pp.__memcg_memory_event
     92.34           -92.1        0.24 ± 49%  perf-profile.children.cycles-pp.__mem_cgroup_try_charge_swap
     92.04           -92.0        0.00        perf-profile.children.cycles-pp.cgroup_file_notify
     93.22           -71.6       21.58 ± 68%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     92.62           -48.5       44.08 ± 77%  perf-profile.children.cycles-pp.shrink_folio_list
     92.51           -48.5       44.00 ± 77%  perf-profile.children.cycles-pp.folio_alloc_swap
     92.71           -47.8       44.90 ± 74%  perf-profile.children.cycles-pp.reclaim_folio_list
     92.72           -47.8       44.92 ± 74%  perf-profile.children.cycles-pp.reclaim_pages
     93.61           -23.0       70.61 ± 22%  perf-profile.children.cycles-pp.madvise_pageout
     93.86           -22.3       71.51 ± 21%  perf-profile.children.cycles-pp.madvise_cold_or_pageout_pte_range
     93.90           -22.3       71.59 ± 21%  perf-profile.children.cycles-pp.walk_pud_range
     93.90           -22.3       71.60 ± 21%  perf-profile.children.cycles-pp.walk_p4d_range
     93.90           -22.3       71.59 ± 21%  perf-profile.children.cycles-pp.walk_pmd_range
     93.90           -22.3       71.60 ± 21%  perf-profile.children.cycles-pp.__walk_page_range
     93.90           -22.3       71.60 ± 21%  perf-profile.children.cycles-pp.walk_pgd_range
     93.90           -22.3       71.60 ± 21%  perf-profile.children.cycles-pp.walk_page_range_vma_unsafe
     93.91           -21.8       72.07 ± 20%  perf-profile.children.cycles-pp.madvise_vma_behavior
     94.15           -20.9       73.21 ± 19%  perf-profile.children.cycles-pp.do_madvise
     94.14           -20.9       73.20 ± 19%  perf-profile.children.cycles-pp.madvise_do_behavior
     94.15           -20.9       73.21 ± 19%  perf-profile.children.cycles-pp.__x64_sys_madvise
     94.16           -20.9       73.23 ± 19%  perf-profile.children.cycles-pp.__madvise
     93.81            -3.8       90.06        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     98.15            -1.2       96.99        perf-profile.children.cycles-pp.do_syscall_64
     98.15            -1.2       97.00        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.08 ± 14%      +0.0        0.11 ±  6%  perf-profile.children.cycles-pp.free_unref_folios
      0.06            +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.mas_store_prealloc
      0.05            +0.0        0.09 ±  4%  perf-profile.children.cycles-pp.__pi_memcpy
      0.05            +0.0        0.09 ± 22%  perf-profile.children.cycles-pp.update_process_times
      0.07            +0.0        0.12 ±  6%  perf-profile.children.cycles-pp.mas_store_gfp
      0.14 ± 12%      +0.0        0.18 ±  6%  perf-profile.children.cycles-pp.move_ptes
      0.06            +0.0        0.11 ± 22%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.05            +0.1        0.10 ± 26%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.00            +0.1        0.05 ±  7%  perf-profile.children.cycles-pp.sched_tick
      0.03 ± 70%      +0.1        0.09 ±  5%  perf-profile.children.cycles-pp.vma_link
      0.16 ± 12%      +0.1        0.21 ±  5%  perf-profile.children.cycles-pp.move_page_tables
      0.00            +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.unmapped_area_topdown
      0.00            +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.vm_unmapped_area
      0.01 ±223%      +0.1        0.07 ±  7%  perf-profile.children.cycles-pp.__get_unmapped_area
      0.06            +0.1        0.12 ±  6%  perf-profile.children.cycles-pp.__mmap_new_vma
      0.08            +0.1        0.14 ±  6%  perf-profile.children.cycles-pp.copy_vma
      0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown
      0.09 ±  4%      +0.1        0.16 ±  3%  perf-profile.children.cycles-pp.mas_wr_node_store
      0.08 ±  6%      +0.1        0.16 ± 28%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.06            +0.1        0.14 ±  5%  perf-profile.children.cycles-pp.mas_preallocate
      0.08            +0.1        0.17 ± 26%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.07 ±  6%      +0.1        0.17 ±  6%  perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
      0.13 ±  2%      +0.1        0.23 ±  2%  perf-profile.children.cycles-pp.__mmap_region
      0.00            +0.1        0.11 ±  5%  perf-profile.children.cycles-pp.__refill_objects_node
      0.01 ±223%      +0.1        0.12 ±  4%  perf-profile.children.cycles-pp.__pcs_replace_empty_main
      0.00            +0.1        0.11 ±  6%  perf-profile.children.cycles-pp.refill_objects
      0.16 ±  3%      +0.1        0.28 ±  2%  perf-profile.children.cycles-pp.do_mmap
      0.24 ±  8%      +0.1        0.35 ±  2%  perf-profile.children.cycles-pp.copy_vma_and_data
      0.35 ±  3%      +0.2        0.54 ±  3%  perf-profile.children.cycles-pp.move_vma
      0.05 ±  7%      +0.3        0.32 ± 44%  perf-profile.children.cycles-pp.__irq_exit_rcu
      0.05            +0.3        0.32 ± 43%  perf-profile.children.cycles-pp.handle_softirqs
      0.00            +0.3        0.27 ± 53%  perf-profile.children.cycles-pp.__slab_free
      0.00            +0.3        0.28 ± 50%  perf-profile.children.cycles-pp.__kmem_cache_free_bulk
      0.00            +0.3        0.29 ± 49%  perf-profile.children.cycles-pp.rcu_free_sheaf
      0.00            +0.3        0.31 ± 44%  perf-profile.children.cycles-pp.rcu_do_batch
      0.00            +0.3        0.31 ± 43%  perf-profile.children.cycles-pp.rcu_core
      0.20 ± 25%      +0.3        0.54 ±  8%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.14 ±  2%      +0.3        0.49 ± 19%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.69 ±  3%      +0.4        1.13 ±  9%  perf-profile.children.cycles-pp.stress_mmap_check
      0.77 ±  8%      +0.7        1.42 ± 15%  perf-profile.children.cycles-pp.stress_mmap_set
      0.19 ±  7%      +0.7        0.89 ± 51%  perf-profile.children.cycles-pp.__vm_munmap
      0.19 ±  7%      +0.7        0.89 ± 51%  perf-profile.children.cycles-pp.__x64_sys_munmap
      0.21 ±  8%      +0.7        0.92 ± 49%  perf-profile.children.cycles-pp.__munmap
      0.22 ±  7%      +0.9        1.12 ± 53%  perf-profile.children.cycles-pp.faultin_page_range
      0.28 ± 88%      +1.0        1.28 ± 47%  perf-profile.children.cycles-pp.madvise_cold
      0.25 ± 13%     +43.6       43.88 ± 76%  perf-profile.children.cycles-pp._raw_spin_lock
     93.81            -3.9       89.93        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.12 ± 13%      -0.1        0.07 ±  5%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.05            +0.0        0.09 ±  4%  perf-profile.self.cycles-pp.__pi_memcpy
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.move_ptes
      0.00            +0.1        0.11 ±  4%  perf-profile.self.cycles-pp.__refill_objects_node
      0.19 ± 12%      +0.2        0.41 ± 31%  perf-profile.self.cycles-pp._raw_spin_lock
      0.69 ±  3%      +0.4        1.12 ±  9%  perf-profile.self.cycles-pp.stress_mmap_check
      0.68 ± 11%      +0.6        1.32 ± 27%  perf-profile.self.cycles-pp.stress_mmap_set
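
Reading the profile: on the parent commit, roughly 92% of all cycles sit in
native_queued_spin_lock_slowpath underneath cgroup_file_notify <-
__memcg_memory_event <- __mem_cgroup_try_charge_swap, i.e. nearly every CPU
is spinning on cgroup_file_kn_lock while charging swap during madvise-driven
reclaim. With the fast-path commit that chain disappears entirely, and the
residual contention moves to the swap allocation lock inside folio_alloc_swap
(_raw_spin_lock, ~44% of cycles, with large ±77% run-to-run variance). The
headline figure is consistent with the raw counts:

	995054 / 670909 ~= 1.483   (stress-ng.mremap.ops)
	 16595 /  11186 ~= 1.484   (stress-ng.mremap.ops_per_sec)

which matches the reported +48.3%.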




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

