* [linux-next:master] [pid]  7903f907a2: stress-ng.pthread.ops_per_sec 23.4% improvement
From: kernel test robot @ 2025-02-19  5:46 UTC
  To: Mateusz Guzik
  Cc: oe-lkp, lkp, Christian Brauner, Oleg Nesterov, Liam R. Howlett,
	linux-kernel, oliver.sang



Hello,

kernel test robot noticed a 23.4% improvement of stress-ng.pthread.ops_per_sec on:


commit: 7903f907a226058ed99f86e9924e082aea57fc45 ("pid: perform free_pid() calls outside of tasklist_lock")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: pthread
	cpufreq_governor: performance


The commit also has a significant impact on the following test:

+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.vfork.ops_per_sec 28.7% improvement                                    |
| test machine     | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory |
| test parameters  | cpufreq_governor=performance                                                                |
|                  | nr_threads=100%                                                                             |
|                  | test=vfork                                                                                  |
|                  | testtime=60s                                                                                |
+------------------+---------------------------------------------------------------------------------------------+




Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250219/202502191317.d0050992-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pthread/stress-ng/60s

commit: 
  74198dc206 ("pid: sprinkle tasklist_lock asserts")
  7903f907a2 ("pid: perform free_pid() calls outside of tasklist_lock")

74198dc2067b2aa1 7903f907a226058ed99f86e9924 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 5.953e+08 ±  9%     +82.9%  1.089e+09 ±  3%  cpuidle..time
   3067781 ± 17%    +281.8%   11714061 ±  4%  cpuidle..usage
   3156621 ±  7%     -11.8%    2783051 ±  7%  numa-meminfo.node0.AnonPages
    315502 ±  4%     -11.0%     280901 ±  4%  numa-meminfo.node1.PageTables
      2119 ±  4%     -59.4%     861.38 ±  3%  vmstat.procs.r
    695158           +37.7%     957064        vmstat.system.cs
    786439           +58.8%    1248633        vmstat.system.in
    918265           -31.9%     625741 ± 31%  meminfo.AnonHugePages
   9498433 ±  3%     +13.6%   10786868 ±  3%  meminfo.Cached
 1.188e+09           -11.7%  1.049e+09        meminfo.Committed_AS
   5970512 ±  6%     +21.6%    7258946 ±  4%  meminfo.Shmem
      4.38 ± 11%      +3.8        8.20 ±  3%  mpstat.cpu.all.idle%
      0.47            +0.2        0.67        mpstat.cpu.all.irq%
      0.37 ±  6%      +0.4        0.76 ±  5%  mpstat.cpu.all.soft%
      1.47            +0.3        1.82        mpstat.cpu.all.usr%
  39409396           +21.1%   47737561 ±  2%  numa-numastat.node0.local_node
  39517687           +21.1%   47862366 ±  2%  numa-numastat.node0.numa_hit
  39678016           +22.2%   48499008 ±  2%  numa-numastat.node1.local_node
  39806349           +22.1%   48619579 ±  2%  numa-numastat.node1.numa_hit
     11111 ± 20%     +86.8%      20750 ± 10%  perf-c2c.DRAM.local
      8594 ± 16%     +25.6%      10797 ±  7%  perf-c2c.DRAM.remote
     14151 ± 18%    +100.2%      28336 ±  9%  perf-c2c.HITM.local
      3853 ± 16%     +40.3%       5404 ±  7%  perf-c2c.HITM.remote
     18004 ± 18%     +87.4%      33740 ±  9%  perf-c2c.HITM.total
    785387 ±  8%     -10.5%     702556 ±  7%  numa-vmstat.node0.nr_anon_pages
  39519842           +20.9%   47789798 ±  2%  numa-vmstat.node0.numa_hit
  39411551           +20.9%   47665001 ±  2%  numa-vmstat.node0.numa_local
     78603 ±  3%      -9.8%      70878 ±  5%  numa-vmstat.node1.nr_page_table_pages
  39804028           +22.0%   48541084 ±  2%  numa-vmstat.node1.numa_hit
  39675696           +22.0%   48420524 ±  2%  numa-vmstat.node1.numa_local
    304344 ±  7%     -66.2%     102730 ±  5%  stress-ng.pthread.nanosecs_to_start_a_pthread
  10003318           +23.2%   12323193        stress-ng.pthread.ops
    166143           +23.4%     204943        stress-ng.pthread.ops_per_sec
   4793153           +19.3%    5716581        stress-ng.time.involuntary_context_switches
  21587233           +23.1%   26564025        stress-ng.time.minor_page_faults
     13184           +11.2%      14659        stress-ng.time.percent_of_cpu_this_job_got
      7880           +10.4%       8702        stress-ng.time.system_time
    105.74           +51.1%     159.78        stress-ng.time.user_time
  23363531           +24.5%   29091883        stress-ng.time.voluntary_context_switches
   3104817 ±  2%      +7.0%    3322678 ±  2%  proc-vmstat.nr_active_anon
   1610889            -6.3%    1509476 ±  3%  proc-vmstat.nr_anon_pages
    447.53           -31.7%     305.57 ± 31%  proc-vmstat.nr_anon_transparent_hugepages
   2380189 ±  3%     +13.4%    2699415 ±  3%  proc-vmstat.nr_file_pages
   1794253            -3.7%    1727492        proc-vmstat.nr_kernel_stack
    154819            -9.1%     140710        proc-vmstat.nr_page_table_pages
   1498207 ±  5%     +21.3%    1817432 ±  4%  proc-vmstat.nr_shmem
     47516            +2.5%      48728        proc-vmstat.nr_slab_reclaimable
   3104817 ±  2%      +7.0%    3322678 ±  2%  proc-vmstat.nr_zone_active_anon
    550885 ± 15%     +69.4%     932960 ± 11%  proc-vmstat.numa_hint_faults
    293967 ± 27%     +95.8%     575443 ± 19%  proc-vmstat.numa_hint_faults_local
  79375488           +21.6%   96482937        proc-vmstat.numa_hit
  79138861           +21.6%   96237560        proc-vmstat.numa_local
    330580 ±  9%     +27.1%     420192 ±  5%  proc-vmstat.numa_pages_migrated
    808808 ± 11%     +43.0%    1156712 ±  9%  proc-vmstat.numa_pte_updates
  83384617           +26.0%   1.05e+08        proc-vmstat.pgalloc_normal
  22326472           +22.9%   27448052        proc-vmstat.pgfault
  80530234           +26.2%  1.017e+08        proc-vmstat.pgfree
    330580 ±  9%     +27.1%     420192 ±  5%  proc-vmstat.pgmigrate_success
    261994 ±  8%     +39.8%     366207 ±  7%  proc-vmstat.pgreuse
   4612194 ±  2%     +62.7%    7503881        sched_debug.cfs_rq:/.avg_vruntime.avg
   5440180 ± 13%     +85.6%   10099394 ±  2%  sched_debug.cfs_rq:/.avg_vruntime.max
    501155 ± 64%    +329.5%    2152678 ±  6%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      2.13 ±  9%     -47.3%       1.12 ± 18%  sched_debug.cfs_rq:/.h_nr_queued.avg
     44.33 ± 10%     -55.6%      19.67 ± 47%  sched_debug.cfs_rq:/.h_nr_queued.max
      5.09 ±  5%     -53.8%       2.35 ± 26%  sched_debug.cfs_rq:/.h_nr_queued.stddev
      2.09 ±  9%     -47.9%       1.09 ± 19%  sched_debug.cfs_rq:/.h_nr_runnable.avg
     44.25 ± 10%     -55.7%      19.58 ± 47%  sched_debug.cfs_rq:/.h_nr_runnable.max
      5.05 ±  5%     -54.2%       2.31 ± 27%  sched_debug.cfs_rq:/.h_nr_runnable.stddev
   5340703 ± 12%     +85.8%    9925031 ±  2%  sched_debug.cfs_rq:/.left_deadline.max
   2202572 ±  2%     +55.2%    3417743 ±  9%  sched_debug.cfs_rq:/.left_deadline.stddev
   5340659 ± 12%     +85.8%    9924585 ±  2%  sched_debug.cfs_rq:/.left_vruntime.max
   2202531 ±  2%     +55.2%    3417686 ±  9%  sched_debug.cfs_rq:/.left_vruntime.stddev
    313473 ±  6%     -24.8%     235882 ± 22%  sched_debug.cfs_rq:/.load.avg
   4612199 ±  2%     +62.7%    7503887        sched_debug.cfs_rq:/.min_vruntime.avg
   5440184 ± 13%     +85.6%   10099394 ±  2%  sched_debug.cfs_rq:/.min_vruntime.max
    501154 ± 64%    +329.5%    2152680 ±  6%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.60 ±  6%     -19.5%       0.49 ± 13%  sched_debug.cfs_rq:/.nr_queued.avg
   5340667 ± 12%     +85.8%    9924585 ±  2%  sched_debug.cfs_rq:/.right_vruntime.max
   2202534 ±  2%     +55.2%    3417691 ±  9%  sched_debug.cfs_rq:/.right_vruntime.stddev
    364.26 ±  3%     +16.6%     424.72 ±  2%  sched_debug.cfs_rq:/.util_avg.avg
      1206 ± 23%     +53.8%       1856 ± 26%  sched_debug.cfs_rq:/.util_est.max
    209.57 ±  9%     +27.9%     268.09 ± 11%  sched_debug.cfs_rq:/.util_est.stddev
    360185 ±  5%     +68.1%     605388 ± 15%  sched_debug.cpu.curr->pid.avg
    401600 ±  3%    +120.0%     883327 ±  5%  sched_debug.cpu.curr->pid.stddev
      2.13 ± 10%     -47.0%       1.13 ± 18%  sched_debug.cpu.nr_running.avg
     44.25 ± 10%     -55.6%      19.67 ± 47%  sched_debug.cpu.nr_running.max
      5.08 ±  5%     -53.8%       2.35 ± 25%  sched_debug.cpu.nr_running.stddev
     98005           +37.5%     134753        sched_debug.cpu.nr_switches.avg
    178454 ±  8%    +106.9%     369189 ±  4%  sched_debug.cpu.nr_switches.max
     16050 ± 34%    +376.0%      76393 ±  3%  sched_debug.cpu.nr_switches.stddev
      3.76           +13.7%       4.27        perf-stat.i.MPKI
 1.873e+10            +6.2%  1.989e+10        perf-stat.i.branch-instructions
      0.61            +0.1        0.69        perf-stat.i.branch-miss-rate%
 1.096e+08           +21.8%  1.335e+08        perf-stat.i.branch-misses
     40.32            -2.7       37.62        perf-stat.i.cache-miss-rate%
 3.087e+08           +22.7%  3.787e+08        perf-stat.i.cache-misses
 7.635e+08           +31.5%  1.004e+09        perf-stat.i.cache-references
    712864           +38.1%     984398        perf-stat.i.context-switches
      7.63           -10.6%       6.82        perf-stat.i.cpi
 6.279e+11            -3.7%  6.047e+11        perf-stat.i.cpu-cycles
      2027           -21.4%       1593        perf-stat.i.cycles-between-cache-misses
 8.232e+10            +7.9%  8.881e+10        perf-stat.i.instructions
      0.14           +10.8%       0.15        perf-stat.i.ipc
      8.13           +26.5%      10.29        perf-stat.i.metric.K/sec
    369735           +22.0%     450981        perf-stat.i.minor-faults
    532034           +22.5%     651748        perf-stat.i.page-faults
      3.76           +13.3%       4.26        perf-stat.overall.MPKI
      0.58            +0.1        0.67        perf-stat.overall.branch-miss-rate%
     40.43            -2.7       37.76        perf-stat.overall.cache-miss-rate%
      7.66           -11.4%       6.79        perf-stat.overall.cpi
      2038           -21.8%       1594        perf-stat.overall.cycles-between-cache-misses
      0.13           +12.8%       0.15        perf-stat.overall.ipc
 1.821e+10            +7.3%  1.954e+10        perf-stat.ps.branch-instructions
 1.057e+08           +23.2%  1.302e+08        perf-stat.ps.branch-misses
 3.007e+08           +23.6%  3.717e+08        perf-stat.ps.cache-misses
 7.438e+08           +32.4%  9.845e+08        perf-stat.ps.cache-references
    696299           +38.7%     965478        perf-stat.ps.context-switches
 6.131e+11            -3.4%  5.925e+11        perf-stat.ps.cpu-cycles
     8e+10            +9.0%  8.724e+10        perf-stat.ps.instructions
    356195           +23.6%     440270        perf-stat.ps.minor-faults
    514755           +23.8%     637135        perf-stat.ps.page-faults
 4.867e+12            +9.3%  5.319e+12        perf-stat.total.instructions
     74.42 ± 44%     -60.3       14.16 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     74.41 ± 44%     -60.3       14.16 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.44 ± 44%     -41.7        4.73 ±223%  perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.44 ± 44%     -41.7        4.73 ±223%  perf-profile.calltrace.cycles-pp.__x64_sys_exit.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.43 ± 44%     -41.7        4.72 ±223%  perf-profile.calltrace.cycles-pp.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
     45.72 ± 44%     -41.2        4.50 ±223%  perf-profile.calltrace.cycles-pp.exit_notify.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
     23.46 ± 44%     -23.5        0.00        perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.exit_notify.do_exit.__x64_sys_exit.x64_sys_call
     23.34 ± 44%     -23.3        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.exit_notify.do_exit.__x64_sys_exit
     23.33 ± 45%     -23.3        0.00        perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.copy_process.kernel_clone.__do_sys_clone3.do_syscall_64
     23.24 ± 45%     -23.2        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.copy_process.kernel_clone.__do_sys_clone3
     21.68 ± 44%     -21.7        0.00        perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.release_task.exit_notify.do_exit.__x64_sys_exit
     21.54 ± 44%     -21.5        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.release_task.exit_notify.do_exit
     27.26 ± 45%     -18.0        9.26 ±223%  perf-profile.calltrace.cycles-pp.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
     27.26 ± 45%     -18.0        9.26 ±223%  perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
     22.09 ± 44%     -17.6        4.45 ±223%  perf-profile.calltrace.cycles-pp.release_task.exit_notify.do_exit.__x64_sys_exit.x64_sys_call
     26.16 ± 45%     -17.2        8.99 ±223%  perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.calltrace.cycles-pp.__madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
      5.18 ± 47%      -3.8        1.37 ±223%  perf-profile.calltrace.cycles-pp.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.18 ± 47%      -3.8        1.36 ±223%  perf-profile.calltrace.cycles-pp.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64
      5.08 ± 47%      -3.7        1.34 ±223%  perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise
      5.08 ± 47%      -3.7        1.34 ±223%  perf-profile.calltrace.cycles-pp.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
      5.07 ± 47%      -3.7        1.34 ±223%  perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior
      5.06 ± 47%      -3.7        1.33 ±223%  perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single
     68.48 ± 44%     -68.4        0.09 ±223%  perf-profile.children.cycles-pp.queued_write_lock_slowpath
     81.41 ± 44%     -65.4       16.02 ±223%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     81.40 ± 44%     -65.4       16.01 ±223%  perf-profile.children.cycles-pp.do_syscall_64
     70.40 ± 44%     -57.1       13.32 ±223%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     46.45 ± 44%     -41.7        4.73 ±223%  perf-profile.children.cycles-pp.x64_sys_call
     46.44 ± 44%     -41.7        4.73 ±223%  perf-profile.children.cycles-pp.do_exit
     46.44 ± 44%     -41.7        4.73 ±223%  perf-profile.children.cycles-pp.__x64_sys_exit
     45.74 ± 44%     -41.2        4.50 ±223%  perf-profile.children.cycles-pp.exit_notify
     27.26 ± 45%     -18.0        9.26 ±223%  perf-profile.children.cycles-pp.__do_sys_clone3
     27.26 ± 45%     -18.0        9.26 ±223%  perf-profile.children.cycles-pp.kernel_clone
     22.11 ± 44%     -17.7        4.45 ±223%  perf-profile.children.cycles-pp.release_task
     26.18 ± 45%     -17.2        8.99 ±223%  perf-profile.children.cycles-pp.copy_process
      5.38 ± 47%      -4.0        1.38 ±223%  perf-profile.children.cycles-pp.tlb_finish_mmu
      5.30 ± 47%      -3.9        1.36 ±223%  perf-profile.children.cycles-pp.on_each_cpu_cond_mask
      5.30 ± 47%      -3.9        1.36 ±223%  perf-profile.children.cycles-pp.smp_call_function_many_cond
      5.30 ± 47%      -3.9        1.37 ±223%  perf-profile.children.cycles-pp.flush_tlb_mm_range
      5.25 ± 47%      -3.9        1.38 ±223%  perf-profile.children.cycles-pp.__madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.children.cycles-pp.__x64_sys_madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.children.cycles-pp.do_madvise
      5.18 ± 47%      -3.8        1.37 ±223%  perf-profile.children.cycles-pp.madvise_vma_behavior
      5.18 ± 47%      -3.8        1.36 ±223%  perf-profile.children.cycles-pp.zap_page_range_single
     70.39 ± 44%     -57.1       13.32 ±223%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      5.16 ± 47%      -3.9        1.30 ±223%  perf-profile.self.cycles-pp.smp_call_function_many_cond
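
As a reading aid for the table above, the %change column for count-style
metrics can be recomputed from the two value columns. The following is a
minimal sketch, not part of the lkp tooling: the helper name pct_change and
the rows list are illustrative assumptions, and the numbers are copied from
the stress-ng.pthread rows of this report.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change of the patched value versus the base value, in percent."""
    return (patched - base) / base * 100.0


# (metric, value on parent 74198dc206, value on 7903f907a2), copied from the table
rows = [
    ("stress-ng.pthread.ops_per_sec", 166143, 204943),
    ("stress-ng.pthread.ops", 10003318, 12323193),
    ("stress-ng.pthread.nanosecs_to_start_a_pthread", 304344, 102730),
]

# Round to one decimal place, matching the precision printed in the report
changes = {name: round(pct_change(base, new), 1) for name, base, new in rows}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Rows whose metric is itself a percentage (for example mpstat.cpu.all.idle%,
4.38 versus 8.20, listed as +3.8) appear to show the absolute difference in
percentage points rather than a relative change, so the helper above does not
apply to them.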


***************************************************************************************************
lkp-spr-2sp4: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/vfork/stress-ng/60s

commit: 
  74198dc206 ("pid: sprinkle tasklist_lock asserts")
  7903f907a2 ("pid: perform free_pid() calls outside of tasklist_lock")

74198dc2067b2aa1 7903f907a226058ed99f86e9924 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   6562366 ±  8%     +37.0%    8993652 ± 10%  cpuidle..usage
      0.29            +0.1        0.39        mpstat.cpu.all.soft%
    486692           +31.8%     641303        vmstat.system.cs
    506323            +4.8%     530409        vmstat.system.in
   4004574 ±  3%      +8.7%    4353640 ±  3%  meminfo.Active
   4004574 ±  3%      +8.7%    4353640 ±  3%  meminfo.Active(anon)
   2657761 ±  6%     +15.5%    3069404 ±  5%  meminfo.Shmem
   3257759 ± 11%     +14.3%    3724594 ±  7%  numa-meminfo.node1.Active
   3257759 ± 11%     +14.3%    3724594 ±  7%  numa-meminfo.node1.Active(anon)
   2492828 ±  9%     +21.0%    3017306 ±  6%  numa-meminfo.node1.Shmem
   9063611 ±  2%     +36.5%   12368884 ±  9%  numa-numastat.node0.local_node
   9220375 ±  2%     +35.7%   12513653 ±  9%  numa-numastat.node0.numa_hit
  10168176           +28.3%   13044773        numa-numastat.node1.local_node
  10243149           +28.2%   13131946        numa-numastat.node1.numa_hit
      5700 ±  8%     +47.9%       8432 ± 11%  perf-c2c.DRAM.remote
     14297 ±  7%     +42.5%      20373 ± 12%  perf-c2c.HITM.local
      3624 ±  8%     +54.4%       5597 ± 11%  perf-c2c.HITM.remote
     17922 ±  7%     +44.9%      25970 ± 12%  perf-c2c.HITM.total
     51838 ± 45%     -56.5%      22543 ±105%  numa-vmstat.node0.nr_mapped
   9221619 ±  2%     +35.2%   12469913 ±  9%  numa-vmstat.node0.numa_hit
   9064856 ±  2%     +36.0%   12325144 ± 10%  numa-vmstat.node0.numa_local
    623443 ±  9%     +20.6%     752138 ±  6%  numa-vmstat.node1.nr_shmem
  10243633           +27.8%   13088671        numa-vmstat.node1.numa_hit
  10168660           +27.9%   13001498        numa-vmstat.node1.numa_local
   1378378           +18.3%    1630343        stress-ng.time.involuntary_context_switches
     10647            -3.1%      10321        stress-ng.time.system_time
      1838           +13.8%       2092        stress-ng.time.user_time
  16431508           +30.8%   21498222        stress-ng.time.voluntary_context_switches
   8890752           +28.7%   11442483        stress-ng.vfork.ops
    148177           +28.7%     190706        stress-ng.vfork.ops_per_sec
   1000826 ±  3%      +8.9%    1090125 ±  3%  proc-vmstat.nr_active_anon
   1545626 ±  2%      +6.8%    1650840 ±  2%  proc-vmstat.nr_file_pages
    120475            +2.9%     124024        proc-vmstat.nr_mapped
    663632 ±  6%     +15.9%     768846 ±  5%  proc-vmstat.nr_shmem
   1000826 ±  3%      +8.9%    1090125 ±  3%  proc-vmstat.nr_zone_active_anon
  19510114           +31.5%   25647538 ±  4%  proc-vmstat.numa_hit
  19278378           +31.8%   25415597 ±  4%  proc-vmstat.numa_local
  22280233           +32.9%   29608930 ±  4%  proc-vmstat.pgalloc_normal
  20644303           +35.1%   27885848 ±  4%  proc-vmstat.pgfree
      1.03           +18.9%       1.22 ±  2%  perf-stat.i.MPKI
 1.703e+10            +6.2%  1.809e+10        perf-stat.i.branch-instructions
      0.53 ±  2%      +0.1        0.59 ±  4%  perf-stat.i.branch-miss-rate%
  88001361 ±  3%     +17.3%  1.032e+08 ±  5%  perf-stat.i.branch-misses
  74412375           +27.9%   95182974        perf-stat.i.cache-misses
 7.674e+08 ±  3%     +26.4%  9.698e+08 ±  4%  perf-stat.i.cache-references
    503132           +32.0%     664329        perf-stat.i.context-switches
      8.49            -7.5%       7.85        perf-stat.i.cpi
    112807 ±  2%     +23.7%     139583 ±  5%  perf-stat.i.cpu-migrations
      8617           -23.1%       6627        perf-stat.i.cycles-between-cache-misses
 7.368e+10            +7.4%  7.917e+10        perf-stat.i.instructions
      0.12            +8.3%       0.13        perf-stat.i.ipc
      2.25           +31.7%       2.97        perf-stat.i.metric.K/sec
      1.02           +18.9%       1.21        perf-stat.overall.MPKI
      0.50 ±  2%      +0.1        0.56 ±  3%  perf-stat.overall.branch-miss-rate%
      8.55            -7.5%       7.91        perf-stat.overall.cpi
      8374           -22.2%       6517        perf-stat.overall.cycles-between-cache-misses
      0.12            +8.1%       0.13        perf-stat.overall.ipc
 1.655e+10            +6.2%  1.758e+10        perf-stat.ps.branch-instructions
  82996740 ±  3%     +17.8%   97762479 ±  5%  perf-stat.ps.branch-misses
  73065238           +27.7%   93297913        perf-stat.ps.cache-misses
 7.509e+08 ±  3%     +26.3%  9.487e+08 ±  4%  perf-stat.ps.cache-references
    491567           +32.0%     649035        perf-stat.ps.context-switches
    110242 ±  2%     +23.6%     136250 ±  4%  perf-stat.ps.cpu-migrations
 7.159e+10            +7.4%   7.69e+10        perf-stat.ps.instructions
     11850 ±  2%      +6.0%      12559 ±  3%  perf-stat.ps.minor-faults
     11850 ±  2%      +6.0%      12559 ±  3%  perf-stat.ps.page-faults
 4.334e+12            +8.1%  4.684e+12        perf-stat.total.instructions
      0.55 ± 10%     -29.3%       0.39 ± 13%  perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_cache_node_noprof.__get_vm_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      0.80 ±  3%     -31.4%       0.55 ±  6%  perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_node_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      0.94 ±  3%     -31.1%       0.65 ±  2%  perf-sched.sch_delay.avg.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node.dup_task_struct
      0.30 ±  2%     -14.5%       0.26 ±  4%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.37           -28.9%       0.27        perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      0.81 ± 12%     -28.8%       0.58 ± 10%  perf-sched.sch_delay.avg.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      0.76 ±  4%     -43.4%       0.43 ±  3%  perf-sched.sch_delay.avg.ms.__cond_resched.cgroup_css_set_fork.cgroup_can_fork.copy_process.kernel_clone
      0.42 ± 16%     -45.4%       0.23 ± 15%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
      0.81           -38.6%       0.50 ±  5%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_pid.copy_process.kernel_clone
      0.92           -31.7%       0.63 ±  8%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_fs_struct.copy_process.kernel_clone
      0.87 ±  3%     -33.4%       0.58 ±  8%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_sighand.copy_process.kernel_clone
      0.86 ±  8%     -32.5%       0.58 ±  7%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_signal.copy_process.kernel_clone
      0.96 ±  5%     -36.0%       0.61 ±  4%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.dup_fd.copy_process.kernel_clone
      0.85           -38.0%       0.53 ±  3%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.prepare_creds.copy_creds.copy_process
      0.34 ± 33%     -57.1%       0.15 ± 82%  perf-sched.sch_delay.avg.ms.__cond_resched.kvfree_rcu_drain_ready.kfree_rcu_monitor.process_one_work.worker_thread
      0.04 ±  3%     -20.9%       0.04 ±  6%  perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.17 ±  9%     -31.5%       0.11 ± 16%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.23           -18.1%       0.19        perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.30           -20.7%       0.24 ±  2%  perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.10 ±  6%     -18.2%       0.08 ±  5%  perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.ret_from_fork_asm.[unknown].[unknown]
      0.13           -18.4%       0.11 ±  2%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      1.64 ± 33%     -34.6%       1.07 ± 20%  perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
      0.43 ± 28%     -41.7%       0.25 ± 31%  perf-sched.sch_delay.max.ms.__cond_resched.mmput.exit_mm.do_exit.__x64_sys_exit
      0.78 ± 19%     -42.2%       0.45 ± 25%  perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
      0.13           -20.3%       0.10        perf-sched.total_sch_delay.average.ms
     59.45 ± 12%     -21.3%      46.77 ±  9%  perf-sched.total_sch_delay.max.ms
      2.32           -18.5%       1.89        perf-sched.total_wait_and_delay.average.ms
   1656374           +26.0%    2087010        perf-sched.total_wait_and_delay.count.ms
      2.20           -18.4%       1.79        perf-sched.total_wait_time.average.ms
      0.90           -26.7%       0.66        perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
     20.62 ±  6%     -43.0%      11.74 ±  2%  perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.17 ±  2%     -18.4%       0.14 ±  5%  perf-sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
     60.43 ± 19%     +76.4%     106.62 ± 33%  perf-sched.wait_and_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.65           -18.1%       0.53        perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
     56.03 ±  3%     -45.1%      30.75 ±  2%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.89 ±  3%     -17.5%       0.73 ±  7%  perf-sched.wait_and_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
     10.82           -15.3%       9.17        perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
     33654            -9.5%      30471        perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      1689 ±  8%    +168.2%       4529 ±  8%  perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     59.50 ±  6%     +39.5%      83.00 ± 11%  perf-sched.wait_and_delay.count.__cond_resched.vunmap_p4d_range.__vunmap_range_noflush.remove_vm_area.vfree
    675414           +24.7%     842197        perf-sched.wait_and_delay.count.do_task_dead.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
     69934 ±  4%     +46.4%     102383 ±  6%  perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1118 ± 19%     -36.7%     708.00 ± 28%  perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
    652564           +25.8%     821118        perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
     36347 ±  3%     +89.4%      68847 ±  2%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     62439           +16.9%      72971        perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
    104431           +18.2%     123395        perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      3.18 ±183%     -87.1%       0.41 ± 14%  perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_cache_node_noprof.__get_vm_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      0.83 ±  3%     -30.1%       0.58 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_node_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      1.28 ± 57%     -47.5%       0.67 ±  2%  perf-sched.wait_time.avg.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node.dup_task_struct
      0.52           -25.1%       0.39        perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      0.85 ± 12%     -34.9%       0.55 ± 17%  perf-sched.wait_time.avg.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      0.80 ±  5%     -37.6%       0.50        perf-sched.wait_time.avg.ms.__cond_resched.cgroup_css_set_fork.cgroup_can_fork.copy_process.kernel_clone
      0.79 ± 26%     -37.0%       0.50 ± 19%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.alloc_vmap_area.__get_vm_area_node.__vmalloc_node_range_noprof
      0.51 ±  9%     -42.1%       0.30 ± 12%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
      0.94           -31.8%       0.64 ±  2%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_fs_struct.copy_process.kernel_clone
      0.90 ±  2%     -32.1%       0.61 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_sighand.copy_process.kernel_clone
      0.89 ±  8%     -31.4%       0.61 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_signal.copy_process.kernel_clone
      0.96 ±  2%     -33.2%       0.64 ±  4%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.dup_fd.copy_process.kernel_clone
      0.88           -34.6%       0.57 ±  2%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.prepare_creds.copy_creds.copy_process
     20.58 ±  6%     -43.1%      11.71 ±  2%  perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.13 ±  3%     -17.6%       0.11 ±  4%  perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
     60.37 ± 19%     +76.5%     106.54 ± 33%  perf-sched.wait_time.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.41           -17.9%       0.34        perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
     55.91 ±  3%     -45.2%      30.65 ±  2%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.58 ±  6%     -15.8%       0.49 ± 11%  perf-sched.wait_time.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
     10.69           -15.3%       9.06        perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      1.25           -23.3%       0.96 ± 13%  perf-sched.wait_time.max.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      1.65 ± 34%     -34.4%       1.08 ± 19%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
     44.32 ± 19%     -26.5%      32.59 ± 11%  perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

