From: kernel test robot <oliver.sang@intel.com>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Tejun Heo <tj@kernel.org>, "JP Kobryn" <inwardvessel@gmail.com>,
<cgroups@vger.kernel.org>, <oliver.sang@intel.com>
Subject: [linux-next:master] [cgroup] 36df6e3dbd: will-it-scale.per_process_ops 2.9% improvement
Date: Thu, 31 Jul 2025 15:31:46 +0800
Message-ID: <202507310831.cf3e212e-lkp@intel.com>
Hello,
kernel test robot noticed a 2.9% improvement of will-it-scale.per_process_ops on:
commit: 36df6e3dbd7e7b074e55fec080012184e2fa3a46 ("cgroup: make css_rstat_updated nmi safe")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
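
For context on the commit: css_rstat_updated() is the hook that marks a cgroup subsystem state as having pending per-CPU stat updates so the rstat flusher knows to visit it, and the commit's goal is to make that marking safe to call from NMI context, which rules out taking spinlocks on the update path (see also the css_rstat_updated entries in the perf-profile further down). The sketch below is a rough, self-contained userspace illustration of that general technique using C11 atomics: a lock-free per-CPU push guarded by an "already queued" flag. All names here (mark_updated, drain_updates, struct item, struct percpu_updates) are illustrative and are not the kernel's actual code.

/*
 * Minimal sketch: mark an item as "updated on this CPU" without locks,
 * so the operation is safe from NMI/signal-like contexts. Re-entrant
 * calls for an already-queued item are cheap no-ops.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct item {
	struct item *_Atomic next;   /* lock-free stack link */
	atomic_bool queued;          /* already on this CPU's update list? */
	long counter;                /* the stat being updated */
};

struct percpu_updates {
	struct item *_Atomic head;   /* per-CPU lock-free stack of updated items */
};

/* Update path: one atomic exchange plus one CAS loop, no spinlock. */
static void mark_updated(struct percpu_updates *pcpu, struct item *it)
{
	if (atomic_exchange_explicit(&it->queued, true, memory_order_acq_rel))
		return;              /* already queued, nothing to do */

	struct item *old = atomic_load_explicit(&pcpu->head, memory_order_relaxed);
	do {
		atomic_store_explicit(&it->next, old, memory_order_relaxed);
	} while (!atomic_compare_exchange_weak_explicit(&pcpu->head, &old, it,
							memory_order_release,
							memory_order_relaxed));
}

/* Flush side: detach the whole list at once, then walk it at leisure. */
static struct item *drain_updates(struct percpu_updates *pcpu)
{
	return atomic_exchange_explicit(&pcpu->head, (struct item *)NULL,
					memory_order_acquire);
}

int main(void)
{
	struct percpu_updates cpu0 = { .head = NULL };
	struct item a = { .counter = 1 }, b = { .counter = 2 };

	mark_updated(&cpu0, &a);
	mark_updated(&cpu0, &b);
	mark_updated(&cpu0, &a);     /* duplicate: cheap no-op */

	for (struct item *it = drain_updates(&cpu0); it;
	     it = atomic_load_explicit(&it->next, memory_order_relaxed)) {
		atomic_store_explicit(&it->queued, false, memory_order_relaxed);
		printf("flush item with counter %ld\n", it->counter);
	}
	return 0;
}

The point of the shape above is that the fast path does bounded lock-free work, which is why the per-update cost (and the css_rstat_updated samples in the profile) can drop even though the functional behavior is unchanged.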
testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 192 threads 2 sockets Intel(R) Xeon(R) Platinum 8468V CPU @ 2.4GHz (Sapphire Rapids) with 384G memory
parameters:
nr_task: 100%
mode: process
test: tlb_flush2
cpufreq_governor: performance
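
For readers unfamiliar with the workload: tlb_flush2 is a will-it-scale microbenchmark, and the call stacks in the perf-profile below (do_anonymous_page, madvise_dontneed_free, flush_tlb_mm_range) point to a loop that repeatedly faults in anonymous pages and then discards them with madvise(MADV_DONTNEED), forcing TLB flushes. The sketch below is an illustrative paraphrase of that loop, not the benchmark's actual source; the region size and iteration count are arbitrary.

/* Rough paraphrase of the hot loop implied by the profile below. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE (16UL << 20)   /* 16 MiB of anonymous memory (arbitrary) */

int main(void)
{
	char *buf = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	long pagesize = sysconf(_SC_PAGESIZE);
	unsigned long iterations = 0;

	for (int i = 0; i < 1000; i++) {          /* each pass is one "operation" */
		/* Touch every page: do_anonymous_page() in the profile. */
		for (size_t off = 0; off < REGION_SIZE; off += pagesize)
			buf[off] = 1;

		/* Drop the pages again: madvise_dontneed_free() + flush_tlb_mm_range(). */
		if (madvise(buf, REGION_SIZE, MADV_DONTNEED)) {
			perror("madvise");
			return 1;
		}
		iterations++;
	}

	printf("%lu iterations\n", iterations);
	munmap(buf, REGION_SIZE);
	return 0;
}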
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250731/202507310831.cf3e212e-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/igk-spr-2sp3/tlb_flush2/will-it-scale
commit:
1257b8786a ("cgroup: support to enable nmi-safe css_rstat_updated")
36df6e3dbd ("cgroup: make css_rstat_updated nmi safe")
   1257b8786ac689a2       36df6e3dbd7e7b074e55fec0800
   ----------------       ---------------------------
            %stddev          %change          %stddev
                \               |                 \
2.78 ± 2% +0.3 3.11 mpstat.cpu.all.usr%
522283 ± 32% +29.1% 674402 ± 5% sched_debug.cpu.avg_idle.min
11822911 +2.9% 12170263 will-it-scale.192.processes
61577 +2.9% 63386 will-it-scale.per_process_ops
11822911 +2.9% 12170263 will-it-scale.workload
2.98 ± 11% -25.3% 2.23 ± 31% perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
4312 ± 11% +13.1% 4878 perf-sched.total_wait_and_delay.max.ms
4312 ± 11% +13.1% 4878 perf-sched.total_wait_time.max.ms
320.37 ±104% +191.2% 932.90 ± 14% perf-sched.wait_and_delay.max.ms.__cond_resched.process_one_work.worker_thread.kthread.ret_from_fork
365.55 ± 83% +155.2% 932.79 ± 14% perf-sched.wait_time.max.ms.__cond_resched.process_one_work.worker_thread.kthread.ret_from_fork
2.98 ± 11% -32.0% 2.03 ± 32% perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
18415 +2.9% 18955 proc-vmstat.nr_kernel_stack
1.791e+09 +3.1% 1.848e+09 proc-vmstat.nr_unaccepted
3.59e+09 +3.0% 3.697e+09 proc-vmstat.numa_interleave
3.589e+09 +3.0% 3.697e+09 proc-vmstat.pgalloc_dma32
7.12e+09 +3.0% 7.334e+09 proc-vmstat.pglazyfree
3.589e+09 +3.0% 3.696e+09 proc-vmstat.pgskip_device
2.716e+10 +1.7% 2.761e+10 perf-stat.i.branch-instructions
0.15 +0.0 0.15 perf-stat.i.branch-miss-rate%
38117449 +3.6% 39497918 perf-stat.i.branch-misses
4.20 -1.1% 4.16 perf-stat.i.cpi
0.24 +1.1% 0.24 perf-stat.i.ipc
245.58 +3.0% 252.83 perf-stat.i.metric.K/sec
23582407 +2.9% 24272602 perf-stat.i.minor-faults
23582407 +2.9% 24272602 perf-stat.i.page-faults
0.14 +0.0 0.14 perf-stat.overall.branch-miss-rate%
4.21 -1.1% 4.16 perf-stat.overall.cpi
3359915 -1.8% 3300246 perf-stat.overall.path-length
2.706e+10 +1.7% 2.752e+10 perf-stat.ps.branch-instructions
37940559 +3.7% 39340939 perf-stat.ps.branch-misses
23496794 +2.9% 24189927 perf-stat.ps.minor-faults
23496794 +2.9% 24189927 perf-stat.ps.page-faults
3.972e+13 +1.1% 4.016e+13 perf-stat.total.instructions
58.13 -1.6 56.50 perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru.do_anonymous_page.__handle_mm_fault.handle_mm_fault
58.24 -1.6 56.62 perf-profile.calltrace.cycles-pp.folio_add_lru.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
59.95 -1.5 58.46 perf-profile.calltrace.cycles-pp.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
60.58 -1.4 59.16 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
61.08 -1.4 59.69 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
62.15 -1.3 60.87 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
62.26 -1.3 60.99 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
28.72 -1.1 27.64 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru
28.74 -1.1 27.66 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru.do_anonymous_page
28.74 -1.1 27.66 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru.do_anonymous_page.__handle_mm_fault
66.84 -0.9 65.93 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
67.47 -0.8 66.62 perf-profile.calltrace.cycles-pp.testcase
28.90 -0.6 28.27 perf-profile.calltrace.cycles-pp.folios_put_refs.folio_batch_move_lru.folio_add_lru.do_anonymous_page.__handle_mm_fault
0.54 ± 4% +0.1 0.59 perf-profile.calltrace.cycles-pp.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page
0.60 ± 4% +0.1 0.66 perf-profile.calltrace.cycles-pp.alloc_pages_mpol.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page.__handle_mm_fault
0.68 ± 3% +0.1 0.74 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page.__handle_mm_fault.handle_mm_fault
0.84 ± 4% +0.1 0.92 perf-profile.calltrace.cycles-pp.unmap_page_range.zap_page_range_single_batched.madvise_dontneed_free.madvise_vma_behavior.madvise_do_behavior
0.94 ± 4% +0.1 1.03 perf-profile.calltrace.cycles-pp.zap_page_range_single_batched.madvise_dontneed_free.madvise_vma_behavior.madvise_do_behavior.do_madvise
1.00 ± 4% +0.1 1.10 perf-profile.calltrace.cycles-pp.madvise_dontneed_free.madvise_vma_behavior.madvise_do_behavior.do_madvise.__x64_sys_madvise
0.92 ± 3% +0.1 1.02 perf-profile.calltrace.cycles-pp.alloc_anon_folio.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
1.06 ± 4% +0.1 1.16 perf-profile.calltrace.cycles-pp.madvise_vma_behavior.madvise_do_behavior.do_madvise.__x64_sys_madvise.do_syscall_64
1.48 ± 4% +0.2 1.64 perf-profile.calltrace.cycles-pp.madvise_do_behavior.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.26 ±100% +0.3 0.55 perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.60 ± 6% +0.3 0.95 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.tlb_finish_mmu.do_madvise.__x64_sys_madvise
0.77 ± 5% +0.4 1.14 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.tlb_finish_mmu.do_madvise.__x64_sys_madvise.do_syscall_64
0.08 ±223% +0.5 0.59 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.tlb_finish_mmu.do_madvise
32.13 ± 2% +0.8 32.98 perf-profile.calltrace.cycles-pp.__madvise
58.20 -1.6 56.57 perf-profile.children.cycles-pp.folio_batch_move_lru
58.30 -1.6 56.68 perf-profile.children.cycles-pp.folio_add_lru
59.98 -1.5 58.50 perf-profile.children.cycles-pp.do_anonymous_page
60.60 -1.4 59.18 perf-profile.children.cycles-pp.__handle_mm_fault
61.12 -1.4 59.73 perf-profile.children.cycles-pp.handle_mm_fault
62.19 -1.3 60.90 perf-profile.children.cycles-pp.do_user_addr_fault
62.28 -1.3 61.01 perf-profile.children.cycles-pp.exc_page_fault
67.07 -0.9 66.15 perf-profile.children.cycles-pp.asm_exc_page_fault
0.14 ± 5% -0.1 0.09 ± 4% perf-profile.children.cycles-pp.css_rstat_updated
0.11 ± 3% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.record__mmap_read_evlist
0.12 ± 4% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.handle_internal_command
0.12 ± 4% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.main
0.12 ± 4% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.run_builtin
0.12 ± 4% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.__cmd_record
0.12 ± 4% -0.0 0.10 ± 6% perf-profile.children.cycles-pp.cmd_record
0.11 -0.0 0.10 ± 4% perf-profile.children.cycles-pp.perf_mmap__push
0.10 ± 3% +0.0 0.11 perf-profile.children.cycles-pp.access_error
0.11 ± 3% +0.0 0.12 perf-profile.children.cycles-pp.update_process_times
0.16 ± 5% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.clear_page_erms
0.19 ± 3% +0.0 0.22 ± 3% perf-profile.children.cycles-pp.__mem_cgroup_charge
0.30 ± 5% +0.0 0.34 perf-profile.children.cycles-pp.find_vma_prev
0.45 ± 4% +0.0 0.49 perf-profile.children.cycles-pp.get_page_from_freelist
0.26 ± 9% +0.0 0.31 perf-profile.children.cycles-pp.lru_gen_add_folio
0.41 ± 3% +0.0 0.45 perf-profile.children.cycles-pp.mas_walk
0.00 +0.1 0.05 perf-profile.children.cycles-pp.get_vma_policy
0.52 ± 4% +0.1 0.57 perf-profile.children.cycles-pp.lock_vma_under_rcu
0.56 ± 4% +0.1 0.61 perf-profile.children.cycles-pp.__alloc_frozen_pages_noprof
0.63 ± 3% +0.1 0.69 perf-profile.children.cycles-pp.alloc_pages_mpol
0.68 ± 3% +0.1 0.75 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
0.38 ± 9% +0.1 0.44 perf-profile.children.cycles-pp.lru_add
0.96 ± 4% +0.1 1.04 perf-profile.children.cycles-pp.unmap_page_range
0.94 ± 4% +0.1 1.04 perf-profile.children.cycles-pp.zap_page_range_single_batched
0.94 ± 4% +0.1 1.04 perf-profile.children.cycles-pp.alloc_anon_folio
1.01 ± 4% +0.1 1.11 perf-profile.children.cycles-pp.madvise_dontneed_free
1.06 ± 4% +0.1 1.17 perf-profile.children.cycles-pp.madvise_vma_behavior
0.01 ±223% +0.1 0.12 perf-profile.children.cycles-pp.mm_needs_global_asid
0.47 ± 5% +0.1 0.59 perf-profile.children.cycles-pp.native_flush_tlb_one_user
1.49 ± 4% +0.2 1.64 perf-profile.children.cycles-pp.madvise_do_behavior
0.62 ± 5% +0.4 0.97 perf-profile.children.cycles-pp.flush_tlb_func
0.79 ± 6% +0.4 1.17 perf-profile.children.cycles-pp.flush_tlb_mm_range
32.30 ± 2% +0.9 33.16 perf-profile.children.cycles-pp.__madvise
0.21 ± 10% -0.1 0.09 ± 5% perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.13 ± 6% -0.1 0.07 ± 10% perf-profile.self.cycles-pp.css_rstat_updated
0.05 +0.0 0.06 perf-profile.self.cycles-pp._raw_spin_trylock
0.05 +0.0 0.06 perf-profile.self.cycles-pp.zap_page_range_single_batched
0.06 +0.0 0.07 perf-profile.self.cycles-pp.find_vma_prev
0.07 ± 6% +0.0 0.09 ± 4% perf-profile.self.cycles-pp.folio_batch_move_lru
0.14 ± 4% +0.0 0.16 ± 3% perf-profile.self.cycles-pp.unmap_page_range
0.12 ± 6% +0.0 0.14 perf-profile.self.cycles-pp.flush_tlb_mm_range
0.10 ± 8% +0.0 0.12 ± 3% perf-profile.self.cycles-pp.lru_add
0.21 ± 3% +0.0 0.23 ± 2% perf-profile.self.cycles-pp.handle_mm_fault
0.18 ± 9% +0.0 0.21 perf-profile.self.cycles-pp.lru_gen_add_folio
0.39 ± 3% +0.0 0.44 perf-profile.self.cycles-pp.mas_walk
0.00 +0.1 0.12 ± 4% perf-profile.self.cycles-pp.mm_needs_global_asid
0.46 ± 5% +0.1 0.58 perf-profile.self.cycles-pp.native_flush_tlb_one_user
0.11 ± 7% +0.2 0.27 perf-profile.self.cycles-pp.flush_tlb_func
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki