From: kernel test robot <oliver.sang@intel.com>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Christian Brauner <brauner@kernel.org>,
Oleg Nesterov <oleg@redhat.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
<linux-kernel@vger.kernel.org>, <oliver.sang@intel.com>
Subject: [linux-next:master] [pid] 7903f907a2: stress-ng.pthread.ops_per_sec 23.4% improvement
Date: Wed, 19 Feb 2025 13:46:21 +0800
Message-ID: <202502191317.d0050992-lkp@intel.com>
Hello,
kernel test robot noticed a 23.4% improvement of stress-ng.pthread.ops_per_sec on:
commit: 7903f907a226058ed99f86e9924e082aea57fc45 ("pid: perform free_pid() calls outside of tasklist_lock")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: pthread
cpufreq_governor: performance
In addition, the commit also has a significant impact on the following tests:
+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.vfork.ops_per_sec 28.7% improvement                                    |
| test machine     | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory |
| test parameters  | cpufreq_governor=performance                                                                |
|                  | nr_threads=100%                                                                             |
|                  | test=vfork                                                                                  |
|                  | testtime=60s                                                                                |
+------------------+---------------------------------------------------------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250219/202502191317.d0050992-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pthread/stress-ng/60s
commit:
74198dc206 ("pid: sprinkle tasklist_lock asserts")
7903f907a2 ("pid: perform free_pid() calls outside of tasklist_lock")
74198dc2067b2aa1 7903f907a226058ed99f86e9924
---------------- ---------------------------
%stddev %change %stddev
\ | \
5.953e+08 ± 9% +82.9% 1.089e+09 ± 3% cpuidle..time
3067781 ± 17% +281.8% 11714061 ± 4% cpuidle..usage
3156621 ± 7% -11.8% 2783051 ± 7% numa-meminfo.node0.AnonPages
315502 ± 4% -11.0% 280901 ± 4% numa-meminfo.node1.PageTables
2119 ± 4% -59.4% 861.38 ± 3% vmstat.procs.r
695158 +37.7% 957064 vmstat.system.cs
786439 +58.8% 1248633 vmstat.system.in
918265 -31.9% 625741 ± 31% meminfo.AnonHugePages
9498433 ± 3% +13.6% 10786868 ± 3% meminfo.Cached
1.188e+09 -11.7% 1.049e+09 meminfo.Committed_AS
5970512 ± 6% +21.6% 7258946 ± 4% meminfo.Shmem
4.38 ± 11% +3.8 8.20 ± 3% mpstat.cpu.all.idle%
0.47 +0.2 0.67 mpstat.cpu.all.irq%
0.37 ± 6% +0.4 0.76 ± 5% mpstat.cpu.all.soft%
1.47 +0.3 1.82 mpstat.cpu.all.usr%
39409396 +21.1% 47737561 ± 2% numa-numastat.node0.local_node
39517687 +21.1% 47862366 ± 2% numa-numastat.node0.numa_hit
39678016 +22.2% 48499008 ± 2% numa-numastat.node1.local_node
39806349 +22.1% 48619579 ± 2% numa-numastat.node1.numa_hit
11111 ± 20% +86.8% 20750 ± 10% perf-c2c.DRAM.local
8594 ± 16% +25.6% 10797 ± 7% perf-c2c.DRAM.remote
14151 ± 18% +100.2% 28336 ± 9% perf-c2c.HITM.local
3853 ± 16% +40.3% 5404 ± 7% perf-c2c.HITM.remote
18004 ± 18% +87.4% 33740 ± 9% perf-c2c.HITM.total
785387 ± 8% -10.5% 702556 ± 7% numa-vmstat.node0.nr_anon_pages
39519842 +20.9% 47789798 ± 2% numa-vmstat.node0.numa_hit
39411551 +20.9% 47665001 ± 2% numa-vmstat.node0.numa_local
78603 ± 3% -9.8% 70878 ± 5% numa-vmstat.node1.nr_page_table_pages
39804028 +22.0% 48541084 ± 2% numa-vmstat.node1.numa_hit
39675696 +22.0% 48420524 ± 2% numa-vmstat.node1.numa_local
304344 ± 7% -66.2% 102730 ± 5% stress-ng.pthread.nanosecs_to_start_a_pthread
10003318 +23.2% 12323193 stress-ng.pthread.ops
166143 +23.4% 204943 stress-ng.pthread.ops_per_sec
4793153 +19.3% 5716581 stress-ng.time.involuntary_context_switches
21587233 +23.1% 26564025 stress-ng.time.minor_page_faults
13184 +11.2% 14659 stress-ng.time.percent_of_cpu_this_job_got
7880 +10.4% 8702 stress-ng.time.system_time
105.74 +51.1% 159.78 stress-ng.time.user_time
23363531 +24.5% 29091883 stress-ng.time.voluntary_context_switches
3104817 ± 2% +7.0% 3322678 ± 2% proc-vmstat.nr_active_anon
1610889 -6.3% 1509476 ± 3% proc-vmstat.nr_anon_pages
447.53 -31.7% 305.57 ± 31% proc-vmstat.nr_anon_transparent_hugepages
2380189 ± 3% +13.4% 2699415 ± 3% proc-vmstat.nr_file_pages
1794253 -3.7% 1727492 proc-vmstat.nr_kernel_stack
154819 -9.1% 140710 proc-vmstat.nr_page_table_pages
1498207 ± 5% +21.3% 1817432 ± 4% proc-vmstat.nr_shmem
47516 +2.5% 48728 proc-vmstat.nr_slab_reclaimable
3104817 ± 2% +7.0% 3322678 ± 2% proc-vmstat.nr_zone_active_anon
550885 ± 15% +69.4% 932960 ± 11% proc-vmstat.numa_hint_faults
293967 ± 27% +95.8% 575443 ± 19% proc-vmstat.numa_hint_faults_local
79375488 +21.6% 96482937 proc-vmstat.numa_hit
79138861 +21.6% 96237560 proc-vmstat.numa_local
330580 ± 9% +27.1% 420192 ± 5% proc-vmstat.numa_pages_migrated
808808 ± 11% +43.0% 1156712 ± 9% proc-vmstat.numa_pte_updates
83384617 +26.0% 1.05e+08 proc-vmstat.pgalloc_normal
22326472 +22.9% 27448052 proc-vmstat.pgfault
80530234 +26.2% 1.017e+08 proc-vmstat.pgfree
330580 ± 9% +27.1% 420192 ± 5% proc-vmstat.pgmigrate_success
261994 ± 8% +39.8% 366207 ± 7% proc-vmstat.pgreuse
4612194 ± 2% +62.7% 7503881 sched_debug.cfs_rq:/.avg_vruntime.avg
5440180 ± 13% +85.6% 10099394 ± 2% sched_debug.cfs_rq:/.avg_vruntime.max
501155 ± 64% +329.5% 2152678 ± 6% sched_debug.cfs_rq:/.avg_vruntime.stddev
2.13 ± 9% -47.3% 1.12 ± 18% sched_debug.cfs_rq:/.h_nr_queued.avg
44.33 ± 10% -55.6% 19.67 ± 47% sched_debug.cfs_rq:/.h_nr_queued.max
5.09 ± 5% -53.8% 2.35 ± 26% sched_debug.cfs_rq:/.h_nr_queued.stddev
2.09 ± 9% -47.9% 1.09 ± 19% sched_debug.cfs_rq:/.h_nr_runnable.avg
44.25 ± 10% -55.7% 19.58 ± 47% sched_debug.cfs_rq:/.h_nr_runnable.max
5.05 ± 5% -54.2% 2.31 ± 27% sched_debug.cfs_rq:/.h_nr_runnable.stddev
5340703 ± 12% +85.8% 9925031 ± 2% sched_debug.cfs_rq:/.left_deadline.max
2202572 ± 2% +55.2% 3417743 ± 9% sched_debug.cfs_rq:/.left_deadline.stddev
5340659 ± 12% +85.8% 9924585 ± 2% sched_debug.cfs_rq:/.left_vruntime.max
2202531 ± 2% +55.2% 3417686 ± 9% sched_debug.cfs_rq:/.left_vruntime.stddev
313473 ± 6% -24.8% 235882 ± 22% sched_debug.cfs_rq:/.load.avg
4612199 ± 2% +62.7% 7503887 sched_debug.cfs_rq:/.min_vruntime.avg
5440184 ± 13% +85.6% 10099394 ± 2% sched_debug.cfs_rq:/.min_vruntime.max
501154 ± 64% +329.5% 2152680 ± 6% sched_debug.cfs_rq:/.min_vruntime.stddev
0.60 ± 6% -19.5% 0.49 ± 13% sched_debug.cfs_rq:/.nr_queued.avg
5340667 ± 12% +85.8% 9924585 ± 2% sched_debug.cfs_rq:/.right_vruntime.max
2202534 ± 2% +55.2% 3417691 ± 9% sched_debug.cfs_rq:/.right_vruntime.stddev
364.26 ± 3% +16.6% 424.72 ± 2% sched_debug.cfs_rq:/.util_avg.avg
1206 ± 23% +53.8% 1856 ± 26% sched_debug.cfs_rq:/.util_est.max
209.57 ± 9% +27.9% 268.09 ± 11% sched_debug.cfs_rq:/.util_est.stddev
360185 ± 5% +68.1% 605388 ± 15% sched_debug.cpu.curr->pid.avg
401600 ± 3% +120.0% 883327 ± 5% sched_debug.cpu.curr->pid.stddev
2.13 ± 10% -47.0% 1.13 ± 18% sched_debug.cpu.nr_running.avg
44.25 ± 10% -55.6% 19.67 ± 47% sched_debug.cpu.nr_running.max
5.08 ± 5% -53.8% 2.35 ± 25% sched_debug.cpu.nr_running.stddev
98005 +37.5% 134753 sched_debug.cpu.nr_switches.avg
178454 ± 8% +106.9% 369189 ± 4% sched_debug.cpu.nr_switches.max
16050 ± 34% +376.0% 76393 ± 3% sched_debug.cpu.nr_switches.stddev
3.76 +13.7% 4.27 perf-stat.i.MPKI
1.873e+10 +6.2% 1.989e+10 perf-stat.i.branch-instructions
0.61 +0.1 0.69 perf-stat.i.branch-miss-rate%
1.096e+08 +21.8% 1.335e+08 perf-stat.i.branch-misses
40.32 -2.7 37.62 perf-stat.i.cache-miss-rate%
3.087e+08 +22.7% 3.787e+08 perf-stat.i.cache-misses
7.635e+08 +31.5% 1.004e+09 perf-stat.i.cache-references
712864 +38.1% 984398 perf-stat.i.context-switches
7.63 -10.6% 6.82 perf-stat.i.cpi
6.279e+11 -3.7% 6.047e+11 perf-stat.i.cpu-cycles
2027 -21.4% 1593 perf-stat.i.cycles-between-cache-misses
8.232e+10 +7.9% 8.881e+10 perf-stat.i.instructions
0.14 +10.8% 0.15 perf-stat.i.ipc
8.13 +26.5% 10.29 perf-stat.i.metric.K/sec
369735 +22.0% 450981 perf-stat.i.minor-faults
532034 +22.5% 651748 perf-stat.i.page-faults
3.76 +13.3% 4.26 perf-stat.overall.MPKI
0.58 +0.1 0.67 perf-stat.overall.branch-miss-rate%
40.43 -2.7 37.76 perf-stat.overall.cache-miss-rate%
7.66 -11.4% 6.79 perf-stat.overall.cpi
2038 -21.8% 1594 perf-stat.overall.cycles-between-cache-misses
0.13 +12.8% 0.15 perf-stat.overall.ipc
1.821e+10 +7.3% 1.954e+10 perf-stat.ps.branch-instructions
1.057e+08 +23.2% 1.302e+08 perf-stat.ps.branch-misses
3.007e+08 +23.6% 3.717e+08 perf-stat.ps.cache-misses
7.438e+08 +32.4% 9.845e+08 perf-stat.ps.cache-references
696299 +38.7% 965478 perf-stat.ps.context-switches
6.131e+11 -3.4% 5.925e+11 perf-stat.ps.cpu-cycles
8e+10 +9.0% 8.724e+10 perf-stat.ps.instructions
356195 +23.6% 440270 perf-stat.ps.minor-faults
514755 +23.8% 637135 perf-stat.ps.page-faults
4.867e+12 +9.3% 5.319e+12 perf-stat.total.instructions
74.42 ± 44% -60.3 14.16 ±223% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
74.41 ± 44% -60.3 14.16 ±223% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
46.44 ± 44% -41.7 4.73 ±223% perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
46.44 ± 44% -41.7 4.73 ±223% perf-profile.calltrace.cycles-pp.__x64_sys_exit.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
46.43 ± 44% -41.7 4.72 ±223% perf-profile.calltrace.cycles-pp.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
45.72 ± 44% -41.2 4.50 ±223% perf-profile.calltrace.cycles-pp.exit_notify.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
23.46 ± 44% -23.5 0.00 perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.exit_notify.do_exit.__x64_sys_exit.x64_sys_call
23.34 ± 44% -23.3 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.exit_notify.do_exit.__x64_sys_exit
23.33 ± 45% -23.3 0.00 perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.copy_process.kernel_clone.__do_sys_clone3.do_syscall_64
23.24 ± 45% -23.2 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.copy_process.kernel_clone.__do_sys_clone3
21.68 ± 44% -21.7 0.00 perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.release_task.exit_notify.do_exit.__x64_sys_exit
21.54 ± 44% -21.5 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.release_task.exit_notify.do_exit
27.26 ± 45% -18.0 9.26 ±223% perf-profile.calltrace.cycles-pp.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
27.26 ± 45% -18.0 9.26 ±223% perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
22.09 ± 44% -17.6 4.45 ±223% perf-profile.calltrace.cycles-pp.release_task.exit_notify.do_exit.__x64_sys_exit.x64_sys_call
26.16 ± 45% -17.2 8.99 ±223% perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.24 ± 47% -3.9 1.38 ±223% perf-profile.calltrace.cycles-pp.__madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
5.18 ± 47% -3.8 1.37 ±223% perf-profile.calltrace.cycles-pp.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.18 ± 47% -3.8 1.36 ±223% perf-profile.calltrace.cycles-pp.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64
5.08 ± 47% -3.7 1.34 ±223% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise
5.08 ± 47% -3.7 1.34 ±223% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
5.07 ± 47% -3.7 1.34 ±223% perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior
5.06 ± 47% -3.7 1.33 ±223% perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single
68.48 ± 44% -68.4 0.09 ±223% perf-profile.children.cycles-pp.queued_write_lock_slowpath
81.41 ± 44% -65.4 16.02 ±223% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
81.40 ± 44% -65.4 16.01 ±223% perf-profile.children.cycles-pp.do_syscall_64
70.40 ± 44% -57.1 13.32 ±223% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
46.45 ± 44% -41.7 4.73 ±223% perf-profile.children.cycles-pp.x64_sys_call
46.44 ± 44% -41.7 4.73 ±223% perf-profile.children.cycles-pp.do_exit
46.44 ± 44% -41.7 4.73 ±223% perf-profile.children.cycles-pp.__x64_sys_exit
45.74 ± 44% -41.2 4.50 ±223% perf-profile.children.cycles-pp.exit_notify
27.26 ± 45% -18.0 9.26 ±223% perf-profile.children.cycles-pp.__do_sys_clone3
27.26 ± 45% -18.0 9.26 ±223% perf-profile.children.cycles-pp.kernel_clone
22.11 ± 44% -17.7 4.45 ±223% perf-profile.children.cycles-pp.release_task
26.18 ± 45% -17.2 8.99 ±223% perf-profile.children.cycles-pp.copy_process
5.38 ± 47% -4.0 1.38 ±223% perf-profile.children.cycles-pp.tlb_finish_mmu
5.30 ± 47% -3.9 1.36 ±223% perf-profile.children.cycles-pp.on_each_cpu_cond_mask
5.30 ± 47% -3.9 1.36 ±223% perf-profile.children.cycles-pp.smp_call_function_many_cond
5.30 ± 47% -3.9 1.37 ±223% perf-profile.children.cycles-pp.flush_tlb_mm_range
5.25 ± 47% -3.9 1.38 ±223% perf-profile.children.cycles-pp.__madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.children.cycles-pp.__x64_sys_madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.children.cycles-pp.do_madvise
5.18 ± 47% -3.8 1.37 ±223% perf-profile.children.cycles-pp.madvise_vma_behavior
5.18 ± 47% -3.8 1.36 ±223% perf-profile.children.cycles-pp.zap_page_range_single
70.39 ± 44% -57.1 13.32 ±223% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
5.16 ± 47% -3.9 1.30 ±223% perf-profile.self.cycles-pp.smp_call_function_many_cond
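The derived perf-stat figures in the pthread table above (CPI, IPC, MPKI, cycles-between-cache-misses) can be cross-checked against the raw counters. The sketch below is not part of the LKP tooling; it assumes the usual definitions (CPI = cycles / instructions, MPKI = cache misses per 1000 instructions) and uses the rounded perf-stat.ps.* values as printed, so small rounding differences against the perf-stat.overall.* rows are expected.

```python
# Values copied from the perf-stat.ps.* rows of the pthread table:
# (base commit 74198dc206, patched commit 7903f907a2).
cycles = (6.131e11, 5.925e11)        # perf-stat.ps.cpu-cycles
instructions = (8.0e10, 8.724e10)    # perf-stat.ps.instructions
cache_misses = (3.007e8, 3.717e8)    # perf-stat.ps.cache-misses


def derived(i):
    """Recompute derived metrics for column i (0 = base, 1 = patched).

    The formulas are the conventional ones and are an assumption about
    how the report computes its perf-stat.overall.* rows.
    """
    cpi = cycles[i] / instructions[i]
    return {
        "cpi": cpi,
        "ipc": 1.0 / cpi,
        "mpki": cache_misses[i] / instructions[i] * 1000.0,
        "cycles_between_cache_misses": cycles[i] / cache_misses[i],
    }


base, patched = derived(0), derived(1)
for name in base:
    print(f"{name:30s} {base[name]:10.2f} {patched[name]:10.2f}")
```

Read this way, the patched kernel retires more instructions in slightly fewer cycles, which is consistent with less time spent spinning in native_queued_spin_lock_slowpath in the profile rows.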
***************************************************************************************************
lkp-spr-2sp4: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/vfork/stress-ng/60s
commit:
74198dc206 ("pid: sprinkle tasklist_lock asserts")
7903f907a2 ("pid: perform free_pid() calls outside of tasklist_lock")
74198dc2067b2aa1 7903f907a226058ed99f86e9924
---------------- ---------------------------
%stddev %change %stddev
\ | \
6562366 ± 8% +37.0% 8993652 ± 10% cpuidle..usage
0.29 +0.1 0.39 mpstat.cpu.all.soft%
486692 +31.8% 641303 vmstat.system.cs
506323 +4.8% 530409 vmstat.system.in
4004574 ± 3% +8.7% 4353640 ± 3% meminfo.Active
4004574 ± 3% +8.7% 4353640 ± 3% meminfo.Active(anon)
2657761 ± 6% +15.5% 3069404 ± 5% meminfo.Shmem
3257759 ± 11% +14.3% 3724594 ± 7% numa-meminfo.node1.Active
3257759 ± 11% +14.3% 3724594 ± 7% numa-meminfo.node1.Active(anon)
2492828 ± 9% +21.0% 3017306 ± 6% numa-meminfo.node1.Shmem
9063611 ± 2% +36.5% 12368884 ± 9% numa-numastat.node0.local_node
9220375 ± 2% +35.7% 12513653 ± 9% numa-numastat.node0.numa_hit
10168176 +28.3% 13044773 numa-numastat.node1.local_node
10243149 +28.2% 13131946 numa-numastat.node1.numa_hit
5700 ± 8% +47.9% 8432 ± 11% perf-c2c.DRAM.remote
14297 ± 7% +42.5% 20373 ± 12% perf-c2c.HITM.local
3624 ± 8% +54.4% 5597 ± 11% perf-c2c.HITM.remote
17922 ± 7% +44.9% 25970 ± 12% perf-c2c.HITM.total
51838 ± 45% -56.5% 22543 ±105% numa-vmstat.node0.nr_mapped
9221619 ± 2% +35.2% 12469913 ± 9% numa-vmstat.node0.numa_hit
9064856 ± 2% +36.0% 12325144 ± 10% numa-vmstat.node0.numa_local
623443 ± 9% +20.6% 752138 ± 6% numa-vmstat.node1.nr_shmem
10243633 +27.8% 13088671 numa-vmstat.node1.numa_hit
10168660 +27.9% 13001498 numa-vmstat.node1.numa_local
1378378 +18.3% 1630343 stress-ng.time.involuntary_context_switches
10647 -3.1% 10321 stress-ng.time.system_time
1838 +13.8% 2092 stress-ng.time.user_time
16431508 +30.8% 21498222 stress-ng.time.voluntary_context_switches
8890752 +28.7% 11442483 stress-ng.vfork.ops
148177 +28.7% 190706 stress-ng.vfork.ops_per_sec
1000826 ± 3% +8.9% 1090125 ± 3% proc-vmstat.nr_active_anon
1545626 ± 2% +6.8% 1650840 ± 2% proc-vmstat.nr_file_pages
120475 +2.9% 124024 proc-vmstat.nr_mapped
663632 ± 6% +15.9% 768846 ± 5% proc-vmstat.nr_shmem
1000826 ± 3% +8.9% 1090125 ± 3% proc-vmstat.nr_zone_active_anon
19510114 +31.5% 25647538 ± 4% proc-vmstat.numa_hit
19278378 +31.8% 25415597 ± 4% proc-vmstat.numa_local
22280233 +32.9% 29608930 ± 4% proc-vmstat.pgalloc_normal
20644303 +35.1% 27885848 ± 4% proc-vmstat.pgfree
1.03 +18.9% 1.22 ± 2% perf-stat.i.MPKI
1.703e+10 +6.2% 1.809e+10 perf-stat.i.branch-instructions
0.53 ± 2% +0.1 0.59 ± 4% perf-stat.i.branch-miss-rate%
88001361 ± 3% +17.3% 1.032e+08 ± 5% perf-stat.i.branch-misses
74412375 +27.9% 95182974 perf-stat.i.cache-misses
7.674e+08 ± 3% +26.4% 9.698e+08 ± 4% perf-stat.i.cache-references
503132 +32.0% 664329 perf-stat.i.context-switches
8.49 -7.5% 7.85 perf-stat.i.cpi
112807 ± 2% +23.7% 139583 ± 5% perf-stat.i.cpu-migrations
8617 -23.1% 6627 perf-stat.i.cycles-between-cache-misses
7.368e+10 +7.4% 7.917e+10 perf-stat.i.instructions
0.12 +8.3% 0.13 perf-stat.i.ipc
2.25 +31.7% 2.97 perf-stat.i.metric.K/sec
1.02 +18.9% 1.21 perf-stat.overall.MPKI
0.50 ± 2% +0.1 0.56 ± 3% perf-stat.overall.branch-miss-rate%
8.55 -7.5% 7.91 perf-stat.overall.cpi
8374 -22.2% 6517 perf-stat.overall.cycles-between-cache-misses
0.12 +8.1% 0.13 perf-stat.overall.ipc
1.655e+10 +6.2% 1.758e+10 perf-stat.ps.branch-instructions
82996740 ± 3% +17.8% 97762479 ± 5% perf-stat.ps.branch-misses
73065238 +27.7% 93297913 perf-stat.ps.cache-misses
7.509e+08 ± 3% +26.3% 9.487e+08 ± 4% perf-stat.ps.cache-references
491567 +32.0% 649035 perf-stat.ps.context-switches
110242 ± 2% +23.6% 136250 ± 4% perf-stat.ps.cpu-migrations
7.159e+10 +7.4% 7.69e+10 perf-stat.ps.instructions
11850 ± 2% +6.0% 12559 ± 3% perf-stat.ps.minor-faults
11850 ± 2% +6.0% 12559 ± 3% perf-stat.ps.page-faults
4.334e+12 +8.1% 4.684e+12 perf-stat.total.instructions
0.55 ± 10% -29.3% 0.39 ± 13% perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_cache_node_noprof.__get_vm_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
0.80 ± 3% -31.4% 0.55 ± 6% perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_node_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
0.94 ± 3% -31.1% 0.65 ± 2% perf-sched.sch_delay.avg.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node.dup_task_struct
0.30 ± 2% -14.5% 0.26 ± 4% perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
0.37 -28.9% 0.27 perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
0.81 ± 12% -28.8% 0.58 ± 10% perf-sched.sch_delay.avg.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
0.76 ± 4% -43.4% 0.43 ± 3% perf-sched.sch_delay.avg.ms.__cond_resched.cgroup_css_set_fork.cgroup_can_fork.copy_process.kernel_clone
0.42 ± 16% -45.4% 0.23 ± 15% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
0.81 -38.6% 0.50 ± 5% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_pid.copy_process.kernel_clone
0.92 -31.7% 0.63 ± 8% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_fs_struct.copy_process.kernel_clone
0.87 ± 3% -33.4% 0.58 ± 8% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_sighand.copy_process.kernel_clone
0.86 ± 8% -32.5% 0.58 ± 7% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_signal.copy_process.kernel_clone
0.96 ± 5% -36.0% 0.61 ± 4% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.dup_fd.copy_process.kernel_clone
0.85 -38.0% 0.53 ± 3% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.prepare_creds.copy_creds.copy_process
0.34 ± 33% -57.1% 0.15 ± 82% perf-sched.sch_delay.avg.ms.__cond_resched.kvfree_rcu_drain_ready.kfree_rcu_monitor.process_one_work.worker_thread
0.04 ± 3% -20.9% 0.04 ± 6% perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.17 ± 9% -31.5% 0.11 ± 16% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
0.23 -18.1% 0.19 perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
0.30 -20.7% 0.24 ± 2% perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.10 ± 6% -18.2% 0.08 ± 5% perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.ret_from_fork_asm.[unknown].[unknown]
0.13 -18.4% 0.11 ± 2% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
1.64 ± 33% -34.6% 1.07 ± 20% perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
0.43 ± 28% -41.7% 0.25 ± 31% perf-sched.sch_delay.max.ms.__cond_resched.mmput.exit_mm.do_exit.__x64_sys_exit
0.78 ± 19% -42.2% 0.45 ± 25% perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
0.13 -20.3% 0.10 perf-sched.total_sch_delay.average.ms
59.45 ± 12% -21.3% 46.77 ± 9% perf-sched.total_sch_delay.max.ms
2.32 -18.5% 1.89 perf-sched.total_wait_and_delay.average.ms
1656374 +26.0% 2087010 perf-sched.total_wait_and_delay.count.ms
2.20 -18.4% 1.79 perf-sched.total_wait_time.average.ms
0.90 -26.7% 0.66 perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
20.62 ± 6% -43.0% 11.74 ± 2% perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.17 ± 2% -18.4% 0.14 ± 5% perf-sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
60.43 ± 19% +76.4% 106.62 ± 33% perf-sched.wait_and_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
0.65 -18.1% 0.53 perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
56.03 ± 3% -45.1% 30.75 ± 2% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.89 ± 3% -17.5% 0.73 ± 7% perf-sched.wait_and_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
10.82 -15.3% 9.17 perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
33654 -9.5% 30471 perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
1689 ± 8% +168.2% 4529 ± 8% perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
59.50 ± 6% +39.5% 83.00 ± 11% perf-sched.wait_and_delay.count.__cond_resched.vunmap_p4d_range.__vunmap_range_noflush.remove_vm_area.vfree
675414 +24.7% 842197 perf-sched.wait_and_delay.count.do_task_dead.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
69934 ± 4% +46.4% 102383 ± 6% perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
1118 ± 19% -36.7% 708.00 ± 28% perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
652564 +25.8% 821118 perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
36347 ± 3% +89.4% 68847 ± 2% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
62439 +16.9% 72971 perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
104431 +18.2% 123395 perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
3.18 ±183% -87.1% 0.41 ± 14% perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_cache_node_noprof.__get_vm_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
0.83 ± 3% -30.1% 0.58 ± 6% perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_node_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
1.28 ± 57% -47.5% 0.67 ± 2% perf-sched.wait_time.avg.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node.dup_task_struct
0.52 -25.1% 0.39 perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
0.85 ± 12% -34.9% 0.55 ± 17% perf-sched.wait_time.avg.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
0.80 ± 5% -37.6% 0.50 perf-sched.wait_time.avg.ms.__cond_resched.cgroup_css_set_fork.cgroup_can_fork.copy_process.kernel_clone
0.79 ± 26% -37.0% 0.50 ± 19% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.alloc_vmap_area.__get_vm_area_node.__vmalloc_node_range_noprof
0.51 ± 9% -42.1% 0.30 ± 12% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
0.94 -31.8% 0.64 ± 2% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_fs_struct.copy_process.kernel_clone
0.90 ± 2% -32.1% 0.61 ± 6% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_sighand.copy_process.kernel_clone
0.89 ± 8% -31.4% 0.61 ± 6% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_signal.copy_process.kernel_clone
0.96 ± 2% -33.2% 0.64 ± 4% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.dup_fd.copy_process.kernel_clone
0.88 -34.6% 0.57 ± 2% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.prepare_creds.copy_creds.copy_process
20.58 ± 6% -43.1% 11.71 ± 2% perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.13 ± 3% -17.6% 0.11 ± 4% perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
60.37 ± 19% +76.5% 106.54 ± 33% perf-sched.wait_time.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
0.41 -17.9% 0.34 perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
55.91 ± 3% -45.2% 30.65 ± 2% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.58 ± 6% -15.8% 0.49 ± 11% perf-sched.wait_time.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
10.69 -15.3% 9.06 perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
1.25 -23.3% 0.96 ± 13% perf-sched.wait_time.max.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
1.65 ± 34% -34.4% 1.08 ± 19% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
44.32 ± 19% -26.5% 32.59 ± 11% perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
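For readers checking the %change column: most rows appear to report the change relative to the base commit in percent, while rows whose metric is already a percentage (for example mpstat.cpu.all.soft%) appear to report the absolute difference without a percent sign. This is inferred from the numbers in the vfork table rather than from LKP documentation; the helper names below are hypothetical.

```python
def relative_change(base, new):
    """Percent change relative to the base commit (rows shown as '+28.7%')."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference, for metrics that are already percentages."""
    return new - base


# Inputs copied from the vfork table (base commit, patched commit).
print(f"{relative_change(148177, 190706):+.1f}%")  # stress-ng.vfork.ops_per_sec
print(f"{relative_change(8.55, 7.91):+.1f}%")      # perf-stat.overall.cpi
print(f"{point_change(0.29, 0.39):+.1f}")          # mpstat.cpu.all.soft%
```

The %stddev columns on either side are left out here because the per-run samples they are computed from are not included in the report.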
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki