[linus:master] [mm] 1ced09e033: pmbench.throughput.aps 3.5% improvement

From: kernel test robot <oliver.sang@intel.com>
To: Dev Jain <dev.jain@arm.com>
Cc: <oe-lkp@lists.linux.dev>, <oliver.sang@intel.com>
Subject: [linus:master] [mm]  1ced09e033: pmbench.throughput.aps 3.5% improvement
Date: Fri, 29 Nov 2024 10:59:40 +0800
Message-ID: <202411291031.595ea459-lkp@intel.com>



Hello,

kernel test robot noticed a 3.5% improvement of pmbench.throughput.aps on:

commit: 1ced09e0331f6cc4ca7eae75bc0ef03957129a94 ("mm: allocate THP on hugezeropage wp-fault")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: pmbench
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake-E) with 32G memory
parameters:

	runtime: 300s
	nr_threads: 100%
	mapsize: 75%
	cold: 1
	cpufreq_governor: performance






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241129/202411291031.595ea459-lkp@intel.com

=========================================================================================
cold/compiler/cpufreq_governor/kconfig/mapsize/nr_threads/rootfs/runtime/tbox_group/testcase:
  1/gcc-12/performance/x86_64-rhel-9.4/75%/100%/debian-12-x86_64-20240206.cgz/300s/lkp-cfl-e1/pmbench

commit: 
  ebcfc63d6b ("mm: abstract THP allocation")
  1ced09e033 ("mm: allocate THP on hugezeropage wp-fault")

ebcfc63d6bca3cce 1ced09e0331f6cc4ca7eae75bc0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    293877           +26.1%     370613 ±  2%  meminfo.AnonHugePages
    114.20 ± 12%     +37.3%     156.80 ±  8%  perf-c2c.HITM.local
     19618            -1.4%      19349        vmstat.system.in
     84.80 ±  9%     -19.1%      68.60 ± 13%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    162.00 ± 11%     -19.1%     131.10 ± 14%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
    389089            -5.8%     366424        proc-vmstat.numa_hit
    389089            -5.8%     366424        proc-vmstat.numa_local
    486747            -4.8%     463625        proc-vmstat.pgfault
     94.78            -3.3%      91.62 ±  2%  pmbench.latency.ns.average
      0.01            -0.0        0.01 ±  3%  pmbench.read.latency.ns.2K-4K%
      0.40            +0.0        0.45 ± 14%  pmbench.read.latency.ns.512-1K%
     21835 ±  6%     -95.1%       1067        pmbench.time.minor_page_faults
      0.00 ±  5%      -0.0        0.00 ±  9%  pmbench.write.latency.ns.128K-256K%
      0.00 ±  8%      +0.0        0.00 ± 14%  pmbench.write.latency.ns.1M-2M%
      0.00 ±  9%      -0.0        0.00        pmbench.write.latency.ns.256-512%
      0.00 ±  3%      -0.0        0.00        pmbench.write.latency.ns.2K-4K%
      0.00 ±  4%      -0.0        0.00 ± 31%  pmbench.write.latency.ns.2M-4M%
      0.00 ±  5%      -0.0        0.00 ±  3%  pmbench.write.latency.ns.32K-64K%
      0.00 ±  3%      -0.0        0.00 ±  8%  pmbench.write.latency.ns.4K-8K%
      0.00 ± 18%      -0.0        0.00 ± 14%  pmbench.write.latency.ns.512-1K%
      0.00 ±  7%      -0.0        0.00 ±  6%  pmbench.write.latency.ns.512K-1M%
      0.00 ±  8%      -0.0        0.00 ±  6%  pmbench.write.latency.ns.64K-128K%
      0.00 ± 13%      -0.0        0.00 ± 14%  pmbench.write.latency.ns.8K-16K%
 1.688e+08            +3.5%  1.747e+08        pmbench.throughput.aps
      0.44 ±  4%      -0.0        0.39 ± 15%  pmbench.throughput.std_dev%
 4.161e+09            +3.9%  4.322e+09        perf-stat.i.branch-instructions
 1.198e+08            +4.5%  1.253e+08        perf-stat.i.branch-misses
     68.01           +18.3       86.33        perf-stat.i.cache-miss-rate%
 5.197e+08            +3.9%    5.4e+08        perf-stat.i.cache-misses
 7.617e+08           -18.2%  6.228e+08        perf-stat.i.cache-references
      3.41            -4.2%       3.27        perf-stat.i.cpi
 1.973e+10            +3.9%   2.05e+10        perf-stat.i.instructions
      0.30            +4.4%       0.31        perf-stat.i.ipc
      0.01 ± 12%    -100.0%       0.00        perf-stat.i.metric.K/sec
      1293            -5.0%       1228        perf-stat.i.minor-faults
      1293            -5.0%       1228        perf-stat.i.page-faults
     68.24           +18.5       86.71        perf-stat.overall.cache-miss-rate%
      3.40            -4.0%       3.26        perf-stat.overall.cpi
    129.03            -4.0%     123.85        perf-stat.overall.cycles-between-cache-misses
      0.29            +4.2%       0.31        perf-stat.overall.ipc
 4.145e+09            +3.9%  4.306e+09        perf-stat.ps.branch-instructions
 1.194e+08            +4.5%  1.248e+08        perf-stat.ps.branch-misses
 5.176e+08            +3.9%   5.38e+08        perf-stat.ps.cache-misses
 7.587e+08           -18.2%  6.204e+08        perf-stat.ps.cache-references
 1.966e+10            +3.9%  2.042e+10        perf-stat.ps.instructions
      1290            -5.0%       1224        perf-stat.ps.minor-faults
      1290            -5.0%       1224        perf-stat.ps.page-faults
      3.48 ±  5%      -2.9        0.60        perf-profile.calltrace.cycles-pp.access_histogram._ops_rdtsc_init_base_freq
     14.48            -2.4       12.10        perf-profile.calltrace.cycles-pp._ops_rdtsc_init_base_freq
      0.64 ±  2%      +0.0        0.67        perf-profile.calltrace.cycles-pp.roll_dice
      1.10            +0.1        1.18 ±  3%  perf-profile.calltrace.cycles-pp.record_histogram
      4.88            +0.3        5.14        perf-profile.calltrace.cycles-pp.main_bm_thread
      7.46            +0.4        7.85        perf-profile.calltrace.cycles-pp.access_histogram
      9.86            +0.4       10.29        perf-profile.calltrace.cycles-pp._ops_rdtscp
     52.15            +0.5       52.70        perf-profile.calltrace.cycles-pp.measure_read
     11.64            +0.6       12.23        perf-profile.calltrace.cycles-pp._ops_rdtscp._ops_rdtsc_init_base_freq
      8.85            +0.6        9.47        perf-profile.calltrace.cycles-pp.uniform_get_number
     11.18            -2.4        8.76        perf-profile.children.cycles-pp.access_histogram
     14.48            -2.4       12.10        perf-profile.children.cycles-pp._ops_rdtsc_init_base_freq
      0.71            +0.0        0.74        perf-profile.children.cycles-pp.roll_dice
      6.21            +0.3        6.54        perf-profile.children.cycles-pp.main_bm_thread
     52.43            +0.6       52.99        perf-profile.children.cycles-pp.measure_read
      8.85            +0.6        9.48        perf-profile.children.cycles-pp.uniform_get_number
     20.90            +0.9       21.83        perf-profile.children.cycles-pp._ops_rdtscp
     10.27 ±  2%      -2.5        7.80        perf-profile.self.cycles-pp.access_histogram
      0.54 ±  2%      +0.0        0.58 ±  2%  perf-profile.self.cycles-pp.roll_dice
      5.92            +0.3        6.26        perf-profile.self.cycles-pp.main_bm_thread
     51.97            +0.5       52.51        perf-profile.self.cycles-pp.measure_read
      8.67            +0.6        9.28        perf-profile.self.cycles-pp.uniform_get_number
     19.98            +0.8       20.81        perf-profile.self.cycles-pp._ops_rdtscp
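
How to read the %change column: rows whose change carries a "%" sign are relative changes between the two commits, while rows for metrics that are already percentages (for example cache-miss-rate%) show a difference in percentage points. The sketch below recomputes a few rows from the rounded values printed in the table, so the results only approximate the reported figures. The helper names pct_change and miss_rate are hypothetical and are not part of lkp-tests.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change from the base commit to the patched commit, in percent."""
    return (patched - base) / base * 100.0


def miss_rate(misses: float, references: float) -> float:
    """Cache misses as a percentage of cache references."""
    return misses / references * 100.0


# (metric, ebcfc63d6b value, 1ced09e033 value), copied from the table above.
rows = [
    ("pmbench.throughput.aps", 1.688e8, 1.747e8),       # table reports +3.5%
    ("pmbench.time.minor_page_faults", 21835, 1067),    # table reports -95.1%
    ("meminfo.AnonHugePages", 293877, 370613),          # table reports +26.1%
]

for name, base, patched in rows:
    print(f"{name:32s} {pct_change(base, patched):+7.1f}%")

# perf-stat.overall.cache-miss-rate% is reported as a percentage-point delta.
# Here it is approximated from the perf-stat.ps.* cache counters.
rate_base = miss_rate(5.176e8, 7.587e8)     # table reports 68.24
rate_patched = miss_rate(5.38e8, 6.204e8)   # table reports 86.71
delta_points = rate_patched - rate_base     # table reports +18.5
print(f"cache-miss-rate: {rate_base:.1f}% -> {rate_patched:.1f}% "
      f"({delta_points:+.1f} points)")
```

Note that the cache-miss rate rises mainly because cache-references drop by 18.2% while cache-misses grow by only 3.9%, so a higher rate here does not by itself indicate a regression.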




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
