From: kernel test robot <oliver.sang@intel.com>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Tejun Heo <tj@kernel.org>, "JP Kobryn" <inwardvessel@gmail.com>,
<cgroups@vger.kernel.org>, <oliver.sang@intel.com>
Subject: [linux-next:master] [cgroup] 36df6e3dbd: will-it-scale.per_process_ops 2.9% improvement
Date: Thu, 31 Jul 2025 15:31:46 +0800
Message-ID: <202507310831.cf3e212e-lkp@intel.com>
Hello,
kernel test robot noticed a 2.9% improvement of will-it-scale.per_process_ops on:
commit: 36df6e3dbd7e7b074e55fec080012184e2fa3a46 ("cgroup: make css_rstat_updated nmi safe")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 192 threads 2 sockets Intel(R) Xeon(R) Platinum 8468V CPU @ 2.4GHz (Sapphire Rapids) with 384G memory
parameters:

	nr_task: 100%
	mode: process
	test: tlb_flush2
	cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250731/202507310831.cf3e212e-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/igk-spr-2sp3/tlb_flush2/will-it-scale
commit:
  1257b8786a ("cgroup: support to enable nmi-safe css_rstat_updated")
  36df6e3dbd ("cgroup: make css_rstat_updated nmi safe")

1257b8786ac689a2 36df6e3dbd7e7b074e55fec0800
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
2.78 ± 2% +0.3 3.11 mpstat.cpu.all.usr%
522283 ± 32% +29.1% 674402 ± 5% sched_debug.cpu.avg_idle.min
11822911 +2.9% 12170263 will-it-scale.192.processes
61577 +2.9% 63386 will-it-scale.per_process_ops
11822911 +2.9% 12170263 will-it-scale.workload
2.98 ± 11% -25.3% 2.23 ± 31% perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
4312 ± 11% +13.1% 4878 perf-sched.total_wait_and_delay.max.ms
4312 ± 11% +13.1% 4878 perf-sched.total_wait_time.max.ms
320.37 ±104% +191.2% 932.90 ± 14% perf-sched.wait_and_delay.max.ms.__cond_resched.process_one_work.worker_thread.kthread.ret_from_fork
365.55 ± 83% +155.2% 932.79 ± 14% perf-sched.wait_time.max.ms.__cond_resched.process_one_work.worker_thread.kthread.ret_from_fork
2.98 ± 11% -32.0% 2.03 ± 32% perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
18415 +2.9% 18955 proc-vmstat.nr_kernel_stack
1.791e+09 +3.1% 1.848e+09 proc-vmstat.nr_unaccepted
3.59e+09 +3.0% 3.697e+09 proc-vmstat.numa_interleave
3.589e+09 +3.0% 3.697e+09 proc-vmstat.pgalloc_dma32
7.12e+09 +3.0% 7.334e+09 proc-vmstat.pglazyfree
3.589e+09 +3.0% 3.696e+09 proc-vmstat.pgskip_device
2.716e+10 +1.7% 2.761e+10 perf-stat.i.branch-instructions
0.15 +0.0 0.15 perf-stat.i.branch-miss-rate%
38117449 +3.6% 39497918 perf-stat.i.branch-misses
4.20 -1.1% 4.16 perf-stat.i.cpi
0.24 +1.1% 0.24 perf-stat.i.ipc
245.58 +3.0% 252.83 perf-stat.i.metric.K/sec
23582407 +2.9% 24272602 perf-stat.i.minor-faults
23582407 +2.9% 24272602 perf-stat.i.page-faults
0.14 +0.0 0.14 perf-stat.overall.branch-miss-rate%
4.21 -1.1% 4.16 perf-stat.overall.cpi
3359915 -1.8% 3300246 perf-stat.overall.path-length
2.706e+10 +1.7% 2.752e+10 perf-stat.ps.branch-instructions
37940559 +3.7% 39340939 perf-stat.ps.branch-misses
23496794 +2.9% 24189927 perf-stat.ps.minor-faults
23496794 +2.9% 24189927 perf-stat.ps.page-faults
3.972e+13 +1.1% 4.016e+13 perf-stat.total.instructions
58.13 -1.6 56.50 perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru.do_anonymous_page.__handle_mm_fault.handle_mm_fault
58.24 -1.6 56.62 perf-profile.calltrace.cycles-pp.folio_add_lru.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
59.95 -1.5 58.46 perf-profile.calltrace.cycles-pp.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
60.58 -1.4 59.16 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
61.08 -1.4 59.69 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
62.15 -1.3 60.87 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
62.26 -1.3 60.99 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
28.72 -1.1 27.64 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru
28.74 -1.1 27.66 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru.do_anonymous_page
28.74 -1.1 27.66 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru.do_anonymous_page.__handle_mm_fault
66.84 -0.9 65.93 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
67.47 -0.8 66.62 perf-profile.calltrace.cycles-pp.testcase
28.90 -0.6 28.27 perf-profile.calltrace.cycles-pp.folios_put_refs.folio_batch_move_lru.folio_add_lru.do_anonymous_page.__handle_mm_fault
0.54 ± 4% +0.1 0.59 perf-profile.calltrace.cycles-pp.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page
0.60 ± 4% +0.1 0.66 perf-profile.calltrace.cycles-pp.alloc_pages_mpol.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page.__handle_mm_fault
0.68 ± 3% +0.1 0.74 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page.__handle_mm_fault.handle_mm_fault
0.84 ± 4% +0.1 0.92 perf-profile.calltrace.cycles-pp.unmap_page_range.zap_page_range_single_batched.madvise_dontneed_free.madvise_vma_behavior.madvise_do_behavior
0.94 ± 4% +0.1 1.03 perf-profile.calltrace.cycles-pp.zap_page_range_single_batched.madvise_dontneed_free.madvise_vma_behavior.madvise_do_behavior.do_madvise
1.00 ± 4% +0.1 1.10 perf-profile.calltrace.cycles-pp.madvise_dontneed_free.madvise_vma_behavior.madvise_do_behavior.do_madvise.__x64_sys_madvise
0.92 ± 3% +0.1 1.02 perf-profile.calltrace.cycles-pp.alloc_anon_folio.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
1.06 ± 4% +0.1 1.16 perf-profile.calltrace.cycles-pp.madvise_vma_behavior.madvise_do_behavior.do_madvise.__x64_sys_madvise.do_syscall_64
1.48 ± 4% +0.2 1.64 perf-profile.calltrace.cycles-pp.madvise_do_behavior.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.26 ±100% +0.3 0.55 perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.60 ± 6% +0.3 0.95 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.tlb_finish_mmu.do_madvise.__x64_sys_madvise
0.77 ± 5% +0.4 1.14 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.tlb_finish_mmu.do_madvise.__x64_sys_madvise.do_syscall_64
0.08 ±223% +0.5 0.59 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.tlb_finish_mmu.do_madvise
32.13 ± 2% +0.8 32.98 perf-profile.calltrace.cycles-pp.__madvise
58.20 -1.6 56.57 perf-profile.children.cycles-pp.folio_batch_move_lru
58.30 -1.6 56.68 perf-profile.children.cycles-pp.folio_add_lru
59.98 -1.5 58.50 perf-profile.children.cycles-pp.do_anonymous_page
60.60 -1.4 59.18 perf-profile.children.cycles-pp.__handle_mm_fault
61.12 -1.4 59.73 perf-profile.children.cycles-pp.handle_mm_fault
62.19 -1.3 60.90 perf-profile.children.cycles-pp.do_user_addr_fault
62.28 -1.3 61.01 perf-profile.children.cycles-pp.exc_page_fault
67.07 -0.9 66.15 perf-profile.children.cycles-pp.asm_exc_page_fault
0.14 ± 5% -0.1 0.09 ± 4% perf-profile.children.cycles-pp.css_rstat_updated
0.11 ± 3% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.record__mmap_read_evlist
0.12 ± 4% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.handle_internal_command
0.12 ± 4% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.main
0.12 ± 4% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.run_builtin
0.12 ± 4% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.__cmd_record
0.12 ± 4% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.cmd_record
0.11 -0.0 0.10 ± 4% perf-profile.children.cycles-pp.perf_mmap__push
0.10 ± 3% +0.0 0.11 perf-profile.children.cycles-pp.access_error
0.11 ± 3% +0.0 0.12 perf-profile.children.cycles-pp.update_process_times
0.16 ± 5% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.clear_page_erms
0.19 ± 3% +0.0 0.22 ± 3% perf-profile.children.cycles-pp.__mem_cgroup_charge
0.30 ± 5% +0.0 0.34 perf-profile.children.cycles-pp.find_vma_prev
0.45 ± 4% +0.0 0.49 perf-profile.children.cycles-pp.get_page_from_freelist
0.26 ± 9% +0.0 0.31 perf-profile.children.cycles-pp.lru_gen_add_folio
0.41 ± 3% +0.0 0.45 perf-profile.children.cycles-pp.mas_walk
0.00 +0.1 0.05 perf-profile.children.cycles-pp.get_vma_policy
0.52 ± 4% +0.1 0.57 perf-profile.children.cycles-pp.lock_vma_under_rcu
0.56 ± 4% +0.1 0.61 perf-profile.children.cycles-pp.__alloc_frozen_pages_noprof
0.63 ± 3% +0.1 0.69 perf-profile.children.cycles-pp.alloc_pages_mpol
0.68 ± 3% +0.1 0.75 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
0.38 ± 9% +0.1 0.44 perf-profile.children.cycles-pp.lru_add
0.96 ± 4% +0.1 1.04 perf-profile.children.cycles-pp.unmap_page_range
0.94 ± 4% +0.1 1.04 perf-profile.children.cycles-pp.zap_page_range_single_batched
0.94 ± 4% +0.1 1.04 perf-profile.children.cycles-pp.alloc_anon_folio
1.01 ± 4% +0.1 1.11 perf-profile.children.cycles-pp.madvise_dontneed_free
1.06 ± 4% +0.1 1.17 perf-profile.children.cycles-pp.madvise_vma_behavior
0.01 ±223% +0.1 0.12 perf-profile.children.cycles-pp.mm_needs_global_asid
0.47 ± 5% +0.1 0.59 perf-profile.children.cycles-pp.native_flush_tlb_one_user
1.49 ± 4% +0.2 1.64 perf-profile.children.cycles-pp.madvise_do_behavior
0.62 ± 5% +0.4 0.97 perf-profile.children.cycles-pp.flush_tlb_func
0.79 ± 6% +0.4 1.17 perf-profile.children.cycles-pp.flush_tlb_mm_range
32.30 ± 2% +0.9 33.16 perf-profile.children.cycles-pp.__madvise
0.21 ± 10% -0.1 0.09 ± 5% perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.13 ± 6% -0.1 0.07 ± 10% perf-profile.self.cycles-pp.css_rstat_updated
0.05 +0.0 0.06 perf-profile.self.cycles-pp._raw_spin_trylock
0.05 +0.0 0.06 perf-profile.self.cycles-pp.zap_page_range_single_batched
0.06 +0.0 0.07 perf-profile.self.cycles-pp.find_vma_prev
0.07 ± 6% +0.0 0.09 ± 4% perf-profile.self.cycles-pp.folio_batch_move_lru
0.14 ± 4% +0.0 0.16 ± 3% perf-profile.self.cycles-pp.unmap_page_range
0.12 ± 6% +0.0 0.14 perf-profile.self.cycles-pp.flush_tlb_mm_range
0.10 ± 8% +0.0 0.12 ± 3% perf-profile.self.cycles-pp.lru_add
0.21 ± 3% +0.0 0.23 ± 2% perf-profile.self.cycles-pp.handle_mm_fault
0.18 ± 9% +0.0 0.21 perf-profile.self.cycles-pp.lru_gen_add_folio
0.39 ± 3% +0.0 0.44 perf-profile.self.cycles-pp.mas_walk
0.00 +0.1 0.12 ± 4% perf-profile.self.cycles-pp.mm_needs_global_asid
0.46 ± 5% +0.1 0.58 perf-profile.self.cycles-pp.native_flush_tlb_one_user
0.11 ± 7% +0.2 0.27 perf-profile.self.cycles-pp.flush_tlb_func
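
For readers checking the headline figures, the following is a minimal sketch of how the %change column can be recomputed from the two value columns. The helper names are hypothetical and not part of lkp-tests. The division of will-it-scale.workload by the 192 processes is only an observation that matches the will-it-scale.per_process_ops rows above, not a statement of how the robot derives that metric. Rows whose metric is itself a percentage (mpstat, perf-profile) appear to report an absolute difference in points rather than a relative change.

```python
# Sketch only: recompute the headline numbers from the table above.
# Function names are illustrative assumptions, not lkp-tests APIs.

def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, as used for rows that are already percentages."""
    return new - base


# will-it-scale.workload for the parent commit and the tested commit
base_workload = 11822911
new_workload = 12170263
nr_processes = 192  # nr_task=100% on the 192-thread test machine

print(round(pct_change(base_workload, new_workload), 1))  # 2.9

# Integer division reproduces the per_process_ops rows (61577 and 63386).
print(base_workload // nr_processes, new_workload // nr_processes)

# mpstat.cpu.all.usr% is reported as a difference in points: 2.78 -> 3.11
print(round(point_change(2.78, 3.11), 1))  # 0.3
```
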
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki