* [linus:master] [mm]  1ced09e033: pmbench.throughput.aps 3.5% improvement
From: kernel test robot @ 2024-11-29  2:59 UTC (permalink / raw)
  To: Dev Jain; +Cc: oe-lkp, oliver.sang



Hello,

kernel test robot noticed a 3.5% improvement of pmbench.throughput.aps on:

commit: 1ced09e0331f6cc4ca7eae75bc0ef03957129a94 ("mm: allocate THP on hugezeropage wp-fault")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: pmbench
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake-E) with 32G memory
parameters:

	runtime: 300s
	nr_threads: 100%
	mapsize: 75%
	cold: 1
	cpufreq_governor: performance






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241129/202411291031.595ea459-lkp@intel.com

=========================================================================================
cold/compiler/cpufreq_governor/kconfig/mapsize/nr_threads/rootfs/runtime/tbox_group/testcase:
  1/gcc-12/performance/x86_64-rhel-9.4/75%/100%/debian-12-x86_64-20240206.cgz/300s/lkp-cfl-e1/pmbench

commit: 
  ebcfc63d6b ("mm: abstract THP allocation")
  1ced09e033 ("mm: allocate THP on hugezeropage wp-fault")

ebcfc63d6bca3cce 1ced09e0331f6cc4ca7eae75bc0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    293877           +26.1%     370613 ±  2%  meminfo.AnonHugePages
    114.20 ± 12%     +37.3%     156.80 ±  8%  perf-c2c.HITM.local
     19618            -1.4%      19349        vmstat.system.in
     84.80 ±  9%     -19.1%      68.60 ± 13%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    162.00 ± 11%     -19.1%     131.10 ± 14%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
    389089            -5.8%     366424        proc-vmstat.numa_hit
    389089            -5.8%     366424        proc-vmstat.numa_local
    486747            -4.8%     463625        proc-vmstat.pgfault
     94.78            -3.3%      91.62 ±  2%  pmbench.latency.ns.average
      0.01            -0.0        0.01 ±  3%  pmbench.read.latency.ns.2K-4K%
      0.40            +0.0        0.45 ± 14%  pmbench.read.latency.ns.512-1K%
     21835 ±  6%     -95.1%       1067        pmbench.time.minor_page_faults
      0.00 ±  5%      -0.0        0.00 ±  9%  pmbench.write.latency.ns.128K-256K%
      0.00 ±  8%      +0.0        0.00 ± 14%  pmbench.write.latency.ns.1M-2M%
      0.00 ±  9%      -0.0        0.00        pmbench.write.latency.ns.256-512%
      0.00 ±  3%      -0.0        0.00        pmbench.write.latency.ns.2K-4K%
      0.00 ±  4%      -0.0        0.00 ± 31%  pmbench.write.latency.ns.2M-4M%
      0.00 ±  5%      -0.0        0.00 ±  3%  pmbench.write.latency.ns.32K-64K%
      0.00 ±  3%      -0.0        0.00 ±  8%  pmbench.write.latency.ns.4K-8K%
      0.00 ± 18%      -0.0        0.00 ± 14%  pmbench.write.latency.ns.512-1K%
      0.00 ±  7%      -0.0        0.00 ±  6%  pmbench.write.latency.ns.512K-1M%
      0.00 ±  8%      -0.0        0.00 ±  6%  pmbench.write.latency.ns.64K-128K%
      0.00 ± 13%      -0.0        0.00 ± 14%  pmbench.write.latency.ns.8K-16K%
 1.688e+08            +3.5%  1.747e+08        pmbench.throughput.aps
      0.44 ±  4%      -0.0        0.39 ± 15%  pmbench.throughput.std_dev%
 4.161e+09            +3.9%  4.322e+09        perf-stat.i.branch-instructions
 1.198e+08            +4.5%  1.253e+08        perf-stat.i.branch-misses
     68.01           +18.3       86.33        perf-stat.i.cache-miss-rate%
 5.197e+08            +3.9%    5.4e+08        perf-stat.i.cache-misses
 7.617e+08           -18.2%  6.228e+08        perf-stat.i.cache-references
      3.41            -4.2%       3.27        perf-stat.i.cpi
 1.973e+10            +3.9%   2.05e+10        perf-stat.i.instructions
      0.30            +4.4%       0.31        perf-stat.i.ipc
      0.01 ± 12%    -100.0%       0.00        perf-stat.i.metric.K/sec
      1293            -5.0%       1228        perf-stat.i.minor-faults
      1293            -5.0%       1228        perf-stat.i.page-faults
     68.24           +18.5       86.71        perf-stat.overall.cache-miss-rate%
      3.40            -4.0%       3.26        perf-stat.overall.cpi
    129.03            -4.0%     123.85        perf-stat.overall.cycles-between-cache-misses
      0.29            +4.2%       0.31        perf-stat.overall.ipc
 4.145e+09            +3.9%  4.306e+09        perf-stat.ps.branch-instructions
 1.194e+08            +4.5%  1.248e+08        perf-stat.ps.branch-misses
 5.176e+08            +3.9%   5.38e+08        perf-stat.ps.cache-misses
 7.587e+08           -18.2%  6.204e+08        perf-stat.ps.cache-references
 1.966e+10            +3.9%  2.042e+10        perf-stat.ps.instructions
      1290            -5.0%       1224        perf-stat.ps.minor-faults
      1290            -5.0%       1224        perf-stat.ps.page-faults
      3.48 ±  5%      -2.9        0.60        perf-profile.calltrace.cycles-pp.access_histogram._ops_rdtsc_init_base_freq
     14.48            -2.4       12.10        perf-profile.calltrace.cycles-pp._ops_rdtsc_init_base_freq
      0.64 ±  2%      +0.0        0.67        perf-profile.calltrace.cycles-pp.roll_dice
      1.10            +0.1        1.18 ±  3%  perf-profile.calltrace.cycles-pp.record_histogram
      4.88            +0.3        5.14        perf-profile.calltrace.cycles-pp.main_bm_thread
      7.46            +0.4        7.85        perf-profile.calltrace.cycles-pp.access_histogram
      9.86            +0.4       10.29        perf-profile.calltrace.cycles-pp._ops_rdtscp
     52.15            +0.5       52.70        perf-profile.calltrace.cycles-pp.measure_read
     11.64            +0.6       12.23        perf-profile.calltrace.cycles-pp._ops_rdtscp._ops_rdtsc_init_base_freq
      8.85            +0.6        9.47        perf-profile.calltrace.cycles-pp.uniform_get_number
     11.18            -2.4        8.76        perf-profile.children.cycles-pp.access_histogram
     14.48            -2.4       12.10        perf-profile.children.cycles-pp._ops_rdtsc_init_base_freq
      0.71            +0.0        0.74        perf-profile.children.cycles-pp.roll_dice
      6.21            +0.3        6.54        perf-profile.children.cycles-pp.main_bm_thread
     52.43            +0.6       52.99        perf-profile.children.cycles-pp.measure_read
      8.85            +0.6        9.48        perf-profile.children.cycles-pp.uniform_get_number
     20.90            +0.9       21.83        perf-profile.children.cycles-pp._ops_rdtscp
     10.27 ±  2%      -2.5        7.80        perf-profile.self.cycles-pp.access_histogram
      0.54 ±  2%      +0.0        0.58 ±  2%  perf-profile.self.cycles-pp.roll_dice
      5.92            +0.3        6.26        perf-profile.self.cycles-pp.main_bm_thread
     51.97            +0.5       52.51        perf-profile.self.cycles-pp.measure_read
      8.67            +0.6        9.28        perf-profile.self.cycles-pp.uniform_get_number
     19.98            +0.8       20.81        perf-profile.self.cycles-pp._ops_rdtscp
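
The %change column mixes two kinds of delta: a relative change (shown with a
"%" sign) for counters such as pmbench.throughput.aps, and a plain difference
in percentage points (shown without "%") for metrics that are already
percentages, such as perf-stat.i.cache-miss-rate%. The sketch below recomputes
a few rows from the values printed above. The helper names are hypothetical,
and the formulas for cache miss rate and IPC are the usual definitions, assumed
here rather than taken from the lkp-tests source.

```python
def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Difference in percentage points between two percentage metrics."""
    return new - base


def miss_rate(misses, references):
    """Cache miss rate in percent (assumed definition: misses / references)."""
    return misses / references * 100.0


# (metric, value on ebcfc63d6b, value on 1ced09e033), copied from the table
relative_rows = [
    ("pmbench.throughput.aps", 1.688e8, 1.747e8),
    ("meminfo.AnonHugePages", 293877, 370613),
    ("pmbench.time.minor_page_faults", 21835, 1067),
]

for name, base, new in relative_rows:
    print(f"{name}: {pct_change(base, new):+.1f}%")

# Already a percentage, so the table reports a point difference instead.
print(f"perf-stat.i.cache-miss-rate%: {point_change(68.01, 86.33):+.1f}")

# Cross-check of derived metrics from the perf-stat.ps and perf-stat.overall rows.
print(f"miss rate before: {miss_rate(5.176e8, 7.587e8):.1f}%")
print(f"miss rate after:  {miss_rate(5.38e8, 6.204e8):.1f}%")
print(f"ipc before: {1 / 3.40:.2f}, ipc after: {1 / 3.26:.2f}")
```

Under these assumptions the recomputed values agree with the table to the
printed precision. They also suggest that the rise in cache-miss-rate% comes
mostly from the 18.2% drop in cache-references, since cache-misses only rose
by 3.9%.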




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

