From: kernel test robot <oliver.sang@intel.com>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Harry Yoo <harry.yoo@oracle.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Vlastimil Babka <vbabka@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Muchun Song <muchun.song@linux.dev>,
	Qi Zheng <zhengqi.arch@bytedance.com>, <cgroups@vger.kernel.org>,
	<linux-mm@kvack.org>, <oliver.sang@intel.com>
Subject: [linus:master] [memcg]  7e44d00a13:  will-it-scale.per_thread_ops 2.6% regression
Date: Wed, 10 Dec 2025 14:15:49 +0800
Message-ID: <202512101408.af3876df-lkp@intel.com>



Hello,

kernel test robot noticed a 2.6% regression of will-it-scale.per_thread_ops on:


commit: 7e44d00a13ca5691caf4f7c46541ee60bf75b208 ("memcg: use mod_node_page_state to update stats")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[regression still present on linux-next/master 6987d58a9cbc5bd57c983baa514474a86c945d56]

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_task: 100%
	mode: thread
	test: page_fault2
	cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202512101408.af3876df-lkp@intel.com


Details are below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251210/202512101408.af3876df-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/thread/100%/debian-13-x86_64-20250902.cgz/lkp-icl-2sp7/page_fault2/will-it-scale

commit: 
  3e700b715e ("selftests/mm: gup_test: fix comment regarding origin of FOLL_WRITE")
  7e44d00a13 ("memcg: use mod_node_page_state to update stats")

3e700b715e1cef66 7e44d00a13ca5691caf4f7c4654 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   3453930            -2.6%    3363916        will-it-scale.64.threads
     53967            -2.6%      52560        will-it-scale.per_thread_ops
   3453930            -2.6%    3363916        will-it-scale.workload
 1.053e+09            -2.6%  1.025e+09        proc-vmstat.numa_hit
 1.052e+09            -2.6%  1.025e+09        proc-vmstat.numa_local
  1.05e+09            -2.6%  1.023e+09        proc-vmstat.pgalloc_normal
 1.045e+09            -2.6%  1.018e+09        proc-vmstat.pgfault
  1.05e+09            -2.6%  1.023e+09        proc-vmstat.pgfree
 3.452e+09            -2.0%  3.383e+09        perf-stat.i.branch-instructions
      0.45            +0.0        0.46        perf-stat.i.branch-miss-rate%
 4.559e+08            -2.5%  4.446e+08        perf-stat.i.cache-misses
 4.696e+08            -2.5%   4.58e+08        perf-stat.i.cache-references
  3.88e+10            -2.4%  3.787e+10        perf-stat.i.cpu-cycles
 1.741e+10            -1.5%  1.715e+10        perf-stat.i.instructions
    107.43            -2.5%     104.76        perf-stat.i.metric.K/sec
   3437960            -2.5%    3352362        perf-stat.i.minor-faults
   3437961            -2.5%    3352362        perf-stat.i.page-faults
     26.18           -34.0%      17.29 ± 70%  perf-stat.overall.MPKI
 3.441e+09           -34.7%  2.247e+09 ± 70%  perf-stat.ps.branch-instructions
 4.544e+08           -35.0%  2.953e+08 ± 70%  perf-stat.ps.cache-misses
  4.68e+08           -35.0%  3.042e+08 ± 70%  perf-stat.ps.cache-references
 3.867e+10           -34.9%  2.517e+10 ± 70%  perf-stat.ps.cpu-cycles
 1.736e+10           -34.4%  1.139e+10 ± 70%  perf-stat.ps.instructions
   3426140           -35.0%    2226448 ± 70%  perf-stat.ps.minor-faults
   3426140           -35.0%    2226448 ± 70%  perf-stat.ps.page-faults
 5.293e+12           -34.4%  3.471e+12 ± 70%  perf-stat.total.instructions
     92.62            -0.2       92.40        perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
      1.39            +0.0        1.42        perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
      1.89            +0.0        1.92        perf-profile.calltrace.cycles-pp.lru_add.folio_batch_move_lru.__folio_batch_add_and_move.set_pte_range.finish_fault
      0.91            +0.0        0.95        perf-profile.calltrace.cycles-pp.lru_gen_del_folio.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
      1.30            +0.0        1.35        perf-profile.calltrace.cycles-pp.folio_remove_rmap_ptes.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range
      2.38            +0.0        2.43        perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
      1.36            +0.1        1.41        perf-profile.calltrace.cycles-pp.lru_gen_add_folio.lru_add.folio_batch_move_lru.__folio_batch_add_and_move.set_pte_range
      4.49            +0.2        4.64        perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      7.62            +0.2        7.79        perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap
      7.62            +0.2        7.79        perf-profile.calltrace.cycles-pp.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap
      7.62            +0.2        7.79        perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas
      7.61            +0.2        7.78        perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.vms_clear_ptes
      8.17            +0.2        8.34        perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      8.17            +0.2        8.34        perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      8.14            +0.2        8.31        perf-profile.calltrace.cycles-pp.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
      8.14            +0.2        8.32        perf-profile.calltrace.cycles-pp.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
      8.17            +0.2        8.34        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      8.17            +0.2        8.34        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
      8.17            +0.2        8.34        perf-profile.calltrace.cycles-pp.__munmap
      8.15            +0.2        8.32        perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
      8.15            +0.2        8.32        perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
     21.20            +0.3       21.46        perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
     90.18            -0.2       89.94        perf-profile.children.cycles-pp.testcase
     86.53            -0.2       86.34        perf-profile.children.cycles-pp.asm_exc_page_fault
      1.58            +0.0        1.63        perf-profile.children.cycles-pp.__page_cache_release
      1.39            +0.0        1.44        perf-profile.children.cycles-pp.lru_gen_add_folio
      1.07            +0.0        1.12        perf-profile.children.cycles-pp.lru_gen_del_folio
      1.33            +0.0        1.38        perf-profile.children.cycles-pp.folio_remove_rmap_ptes
      1.42            +0.1        1.48 ±  2%  perf-profile.children.cycles-pp.__lruvec_stat_mod_folio
      0.36 ±  2%      +0.1        0.42        perf-profile.children.cycles-pp.__mod_lruvec_state
      3.08            +0.1        3.16        perf-profile.children.cycles-pp.folios_put_refs
      4.56            +0.1        4.71        perf-profile.children.cycles-pp.zap_present_ptes
      7.64            +0.2        7.80        perf-profile.children.cycles-pp.unmap_page_range
      7.64            +0.2        7.80        perf-profile.children.cycles-pp.unmap_vmas
      7.64            +0.2        7.80        perf-profile.children.cycles-pp.zap_pmd_range
      7.64            +0.2        7.80        perf-profile.children.cycles-pp.zap_pte_range
      8.17            +0.2        8.34        perf-profile.children.cycles-pp.__x64_sys_munmap
      8.14            +0.2        8.31        perf-profile.children.cycles-pp.vms_clear_ptes
      8.17            +0.2        8.34        perf-profile.children.cycles-pp.__vm_munmap
      8.15            +0.2        8.32        perf-profile.children.cycles-pp.vms_complete_munmap_vmas
      8.17            +0.2        8.34        perf-profile.children.cycles-pp.__munmap
      8.15            +0.2        8.32        perf-profile.children.cycles-pp.do_vmi_align_munmap
      8.15            +0.2        8.33        perf-profile.children.cycles-pp.do_vmi_munmap
      8.41            +0.2        8.59        perf-profile.children.cycles-pp.do_syscall_64
      8.41            +0.2        8.59        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     21.41            +0.3       21.67        perf-profile.children.cycles-pp.finish_fault
      0.00            +0.8        0.78        perf-profile.children.cycles-pp.mod_node_page_state
      0.36 ±  3%      -0.0        0.34 ±  2%  perf-profile.self.cycles-pp.free_pages_and_swap_cache
      0.53 ±  2%      -0.0        0.50        perf-profile.self.cycles-pp.do_user_addr_fault
      0.18 ±  2%      -0.0        0.15 ±  2%  perf-profile.self.cycles-pp.__page_cache_release
      0.49            +0.0        0.53 ±  3%  perf-profile.self.cycles-pp.folios_put_refs
      3.07            +0.1        3.17        perf-profile.self.cycles-pp.zap_present_ptes
      0.00            +0.7        0.73        perf-profile.self.cycles-pp.mod_node_page_state
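
As a rough cross-check of the %change column, the relative change can be recomputed from the base and patched values in the table above. The following is a minimal Python sketch, not part of the robot's tooling; the names `rows` and `pct_change` are hypothetical, and only a few rows are copied in for illustration.

```python
# Values copied from the comparison table above as (base, patched),
# i.e. (3e700b715e, 7e44d00a13). Only a few rows are included.
rows = {
    "will-it-scale.per_thread_ops": (53967, 52560),
    "will-it-scale.workload": (3453930, 3363916),
    "perf-stat.i.instructions": (1.741e10, 1.715e10),
}


def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


for name, (base, new) in rows.items():
    print(f"{name}: {pct_change(base, new):+.1f}%")
```

This formula only applies to the rows whose %change carries a percent sign. The perf-profile rows show a plain difference in percentage points instead (for example 0.00 to 0.78 for mod_node_page_state is listed as +0.8), so they should be compared by subtraction rather than with `pct_change`.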




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

