public inbox for linux-mm@kvack.org
* [linus:master] [slab]  5ba6bc27b1: stress-ng.session.ops_per_sec 3.4% regression
From: kernel test robot @ 2026-04-27  8:35 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: oe-lkp, lkp, linux-kernel, Harry Yoo, Hao Li, linux-mm,
	oliver.sang



Hello,

kernel test robot noticed a 3.4% regression of stress-ng.session.ops_per_sec on:


commit: 5ba6bc27b1f99b35aa528409a8e223136c59e0af ("slab: decouple pointer to barn from kmem_cache_node")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on linus/master      1d51b370a0f8f642f4fc84c795fbedac0fcdbbd2]
[still regression on linux-next/master 936c21068d7ade00325e40d82bfd2f3f29d9f659]

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: session
	cpufreq_governor: performance


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202604271639.21c44b96-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260427/202604271639.21c44b96-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-spr-r02/session/stress-ng/60s

commit: 
  69d73421b7 ("slab: remove alloc_full_sheaf()")
  5ba6bc27b1 ("slab: decouple pointer to barn from kmem_cache_node")

69d73421b76e3d95 5ba6bc27b1f99b35aa528409a8e 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    721150            -3.4%     696311        stress-ng.session.ops
     12027            -3.4%      11614        stress-ng.session.ops_per_sec
   2757163            -3.6%    2658513        stress-ng.time.voluntary_context_switches
     37.76            -1.4%      37.23        turbostat.RAMWatt
    116476            -3.5%     112365        vmstat.system.cs
      6.33            -1.7%       6.22        perf-stat.i.MPKI
      0.52            -0.0        0.50        perf-stat.i.branch-miss-rate%
  96378014            -4.2%   92350977        perf-stat.i.branch-misses
 5.781e+08            -3.1%  5.603e+08        perf-stat.i.cache-misses
 1.298e+09            -3.1%  1.257e+09        perf-stat.i.cache-references
    121290            -3.5%     117042        perf-stat.i.context-switches
      6.34            +2.3%       6.49        perf-stat.i.cpi
      1002            +4.1%       1043        perf-stat.i.cycles-between-cache-misses
 9.116e+10            -1.4%  8.985e+10        perf-stat.i.instructions
      0.16            -2.3%       0.16        perf-stat.i.ipc
      6.38            -1.7%       6.27        perf-stat.overall.MPKI
      0.50            -0.0        0.48        perf-stat.overall.branch-miss-rate%
      6.39            +2.3%       6.53        perf-stat.overall.cpi
      1000            +4.1%       1041        perf-stat.overall.cycles-between-cache-misses
      0.16            -2.2%       0.15        perf-stat.overall.ipc
  91907282            -3.8%   88452571        perf-stat.ps.branch-misses
 5.614e+08            -2.7%  5.462e+08        perf-stat.ps.cache-misses
 1.258e+09            -2.8%  1.223e+09        perf-stat.ps.cache-references
    117705            -3.2%     113987        perf-stat.ps.context-switches
      9.20            -0.3        8.85        perf-profile.calltrace.cycles-pp._Fork
      8.95            -0.3        8.62        perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
      8.95            -0.3        8.62        perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
      8.95            -0.3        8.63        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
      8.95            -0.3        8.63        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._Fork
      7.83            -0.3        7.51        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.83            -0.3        7.51        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
      8.66            -0.3        8.36 ±  2%  perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.48            -0.3        7.18        perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64
      8.45            -0.3        8.15 ±  2%  perf-profile.calltrace.cycles-pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
      7.48            -0.3        7.18        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.48            -0.3        7.18        perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.48            -0.3        7.18        perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.28            -0.3        6.99        perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      7.28            -0.3        6.99        perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
      7.27            -0.3        6.98        perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
      8.08            -0.3        7.80 ±  2%  perf-profile.calltrace.cycles-pp.dup_mmap.dup_mm.copy_process.kernel_clone.__do_sys_clone
      1.66 ±  2%      -0.1        1.55        perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.__mmput.exit_mm.do_exit
      1.54 ±  2%      -0.1        1.43        perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.__mmput.exit_mm
      1.50 ±  2%      -0.1        1.40        perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap.__mmput
      1.48 ±  3%      -0.1        1.37        perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap
      1.10 ±  3%      -0.1        1.01        perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      1.38            -0.1        1.30        perf-profile.calltrace.cycles-pp.online_fair_sched_group.sched_autogroup_create_attach.ksys_setsid.__x64_sys_setsid.do_syscall_64
      1.04            -0.0        1.00        perf-profile.calltrace.cycles-pp.anon_vma_interval_tree_insert.anon_vma_clone.anon_vma_fork.dup_mmap.dup_mm
      0.95            +0.1        1.01        perf-profile.calltrace.cycles-pp.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt._raw_spin_unlock_irqrestore.__refill_objects_node
      0.97            +0.1        1.03        perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt._raw_spin_unlock_irqrestore.__refill_objects_node.refill_objects
      0.97            +0.1        1.03 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_unlock_irqrestore.__refill_objects_node.refill_objects.__pcs_replace_empty_main.__kmalloc_cache_node_noprof
      0.97            +0.1        1.03        perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt._raw_spin_unlock_irqrestore.__refill_objects_node.refill_objects.__pcs_replace_empty_main
      3.46            +0.1        3.60        perf-profile.calltrace.cycles-pp.__slab_free.kfree.free_fair_sched_group.sched_free_group_rcu.rcu_do_batch
      2.98            +0.1        3.12        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__slab_free.__refill_objects_node.refill_objects
      3.00            +0.1        3.15        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__slab_free.__refill_objects_node.refill_objects.__pcs_replace_empty_main
      3.04            +0.2        3.19        perf-profile.calltrace.cycles-pp.__slab_free.__refill_objects_node.refill_objects.__pcs_replace_empty_main.__kmalloc_cache_node_noprof
      5.20            +0.2        5.36        perf-profile.calltrace.cycles-pp.free_fair_sched_group.sched_free_group_rcu.rcu_do_batch.rcu_core.handle_softirqs
      5.20            +0.2        5.37        perf-profile.calltrace.cycles-pp.sched_free_group_rcu.rcu_do_batch.rcu_core.handle_softirqs.__irq_exit_rcu
      5.18            +0.2        5.36        perf-profile.calltrace.cycles-pp.kfree.free_fair_sched_group.sched_free_group_rcu.rcu_do_batch.rcu_core
      4.78            +0.2        4.96        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__refill_objects_node.refill_objects.__pcs_replace_empty_main.__kmalloc_cache_node_noprof
      4.76            +0.2        4.94        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__refill_objects_node.refill_objects.__pcs_replace_empty_main
      9.20            +0.4        9.59        perf-profile.calltrace.cycles-pp.__pcs_replace_empty_main.__kmalloc_cache_node_noprof.alloc_fair_sched_group.sched_create_group.sched_autogroup_create_attach
      9.13            +0.4        9.52        perf-profile.calltrace.cycles-pp.__refill_objects_node.refill_objects.__pcs_replace_empty_main.__kmalloc_cache_node_noprof.alloc_fair_sched_group
      9.15            +0.4        9.55        perf-profile.calltrace.cycles-pp.refill_objects.__pcs_replace_empty_main.__kmalloc_cache_node_noprof.alloc_fair_sched_group.sched_create_group
     60.96            +0.4       61.41        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.get_from_partial_node.___slab_alloc.__kmalloc_cache_node_noprof.alloc_fair_sched_group
     67.38            +0.5       67.85        perf-profile.calltrace.cycles-pp.get_from_partial_node.___slab_alloc.__kmalloc_cache_node_noprof.alloc_fair_sched_group.sched_create_group
     60.37            +0.5       60.85        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.get_from_partial_node.___slab_alloc.__kmalloc_cache_node_noprof
     67.95            +0.5       68.46        perf-profile.calltrace.cycles-pp.___slab_alloc.__kmalloc_cache_node_noprof.alloc_fair_sched_group.sched_create_group.sched_autogroup_create_attach
      2.67 ±  7%      +0.6        3.24        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__slab_free.kfree.free_fair_sched_group.sched_free_group_rcu
      2.65 ±  7%      +0.6        3.22        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__slab_free.kfree.free_fair_sched_group
     79.70            +0.8       80.48        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.setsid
     79.70            +0.8       80.48        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.setsid
     79.71            +0.8       80.48        perf-profile.calltrace.cycles-pp.setsid
     79.70            +0.8       80.47        perf-profile.calltrace.cycles-pp.__x64_sys_setsid.do_syscall_64.entry_SYSCALL_64_after_hwframe.setsid
     79.70            +0.8       80.47        perf-profile.calltrace.cycles-pp.ksys_setsid.__x64_sys_setsid.do_syscall_64.entry_SYSCALL_64_after_hwframe.setsid
     79.68            +0.8       80.45        perf-profile.calltrace.cycles-pp.sched_autogroup_create_attach.ksys_setsid.__x64_sys_setsid.do_syscall_64.entry_SYSCALL_64_after_hwframe
     78.25            +0.9       79.10        perf-profile.calltrace.cycles-pp.sched_create_group.sched_autogroup_create_attach.ksys_setsid.__x64_sys_setsid.do_syscall_64
     78.21            +0.9       79.06        perf-profile.calltrace.cycles-pp.alloc_fair_sched_group.sched_create_group.sched_autogroup_create_attach.ksys_setsid.__x64_sys_setsid
     77.68            +0.9       78.56        perf-profile.calltrace.cycles-pp.__kmalloc_cache_node_noprof.alloc_fair_sched_group.sched_create_group.sched_autogroup_create_attach.ksys_setsid
      9.21            -0.3        8.87        perf-profile.children.cycles-pp._Fork
      8.95            -0.3        8.62        perf-profile.children.cycles-pp.__do_sys_clone
      8.95            -0.3        8.62        perf-profile.children.cycles-pp.kernel_clone
      8.66            -0.3        8.36 ±  2%  perf-profile.children.cycles-pp.copy_process
      7.61            -0.3        7.30        perf-profile.children.cycles-pp.do_exit
      7.61            -0.3        7.30        perf-profile.children.cycles-pp.x64_sys_call
      7.61            -0.3        7.30        perf-profile.children.cycles-pp.__x64_sys_exit_group
      7.61            -0.3        7.30        perf-profile.children.cycles-pp.do_group_exit
      8.45            -0.3        8.15 ±  2%  perf-profile.children.cycles-pp.dup_mm
      7.29            -0.3        7.00        perf-profile.children.cycles-pp.exit_mm
      7.28            -0.3        6.99        perf-profile.children.cycles-pp.__mmput
      7.27            -0.3        6.98        perf-profile.children.cycles-pp.exit_mmap
      8.12            -0.3        7.83 ±  2%  perf-profile.children.cycles-pp.dup_mmap
      1.67 ±  2%      -0.1        1.55        perf-profile.children.cycles-pp.unmap_vmas
      1.51 ±  2%      -0.1        1.40        perf-profile.children.cycles-pp.zap_pmd_range
      1.54 ±  2%      -0.1        1.43        perf-profile.children.cycles-pp.unmap_page_range
      1.49 ±  3%      -0.1        1.38        perf-profile.children.cycles-pp.zap_pte_range
      1.52 ±  3%      -0.1        1.42 ±  3%  perf-profile.children.cycles-pp.exc_page_fault
      1.13 ±  3%      -0.1        1.04        perf-profile.children.cycles-pp.zap_present_ptes
      1.38            -0.1        1.31        perf-profile.children.cycles-pp.online_fair_sched_group
      0.77            -0.1        0.72 ±  2%  perf-profile.children.cycles-pp.rwsem_spin_on_owner
      0.76            -0.0        0.71        perf-profile.children.cycles-pp.__vma_start_write
      1.05            -0.0        1.00        perf-profile.children.cycles-pp.anon_vma_interval_tree_insert
      0.80            -0.0        0.76        perf-profile.children.cycles-pp.sched_unregister_group_rcu
      0.80            -0.0        0.76        perf-profile.children.cycles-pp.unregister_fair_sched_group
      0.66            -0.0        0.62        perf-profile.children.cycles-pp._raw_spin_lock
      0.64            -0.0        0.61        perf-profile.children.cycles-pp.remove_entity_load_avg
      0.80            -0.0        0.77        perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
      0.44            -0.0        0.41 ±  2%  perf-profile.children.cycles-pp.__vma_start_exclude_readers
      0.64            -0.0        0.61        perf-profile.children.cycles-pp.kmem_cache_free
      0.50            -0.0        0.47        perf-profile.children.cycles-pp.raw_spin_rq_lock_nested
      0.40            -0.0        0.37        perf-profile.children.cycles-pp.attach_entity_cfs_rq
      0.39            -0.0        0.37        perf-profile.children.cycles-pp.__pi_memset
      0.37            -0.0        0.35        perf-profile.children.cycles-pp.tear_down_vmas
      0.40            -0.0        0.38        perf-profile.children.cycles-pp.up_write
      0.38            -0.0        0.36        perf-profile.children.cycles-pp.__anon_vma_interval_tree_remove
      0.27            -0.0        0.25        perf-profile.children.cycles-pp.wake_up_new_task
      0.21            -0.0        0.19 ±  2%  perf-profile.children.cycles-pp.sched_balance_find_dst_group
      0.38            -0.0        0.37        perf-profile.children.cycles-pp.update_rq_clock_task
      0.17 ±  2%      -0.0        0.16 ±  2%  perf-profile.children.cycles-pp.wp_page_copy
      0.30            -0.0        0.29        perf-profile.children.cycles-pp.init_tg_cfs_entry
      0.24            -0.0        0.23        perf-profile.children.cycles-pp.select_task_rq_fair
      0.32            -0.0        0.31        perf-profile.children.cycles-pp.__schedule
      0.17            -0.0        0.16 ±  2%  perf-profile.children.cycles-pp.__rb_erase_color
      0.20            -0.0        0.19        perf-profile.children.cycles-pp.__ordered_events__flush
      0.09            -0.0        0.08        perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
      0.20            -0.0        0.19        perf-profile.children.cycles-pp.perf_session__process_user_event
      0.14            -0.0        0.13        perf-profile.children.cycles-pp.schedule_tail
      0.20            -0.0        0.19        perf-profile.children.cycles-pp.update_sg_wakeup_stats
      0.24            +0.0        0.25        perf-profile.children.cycles-pp.on_each_cpu_cond_mask
      0.24            +0.0        0.25        perf-profile.children.cycles-pp.smp_call_function_many_cond
      6.34            +0.1        6.46        perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
      6.53            +0.2        6.71        perf-profile.children.cycles-pp.free_fair_sched_group
      6.53            +0.2        6.71        perf-profile.children.cycles-pp.sched_free_group_rcu
      6.53            +0.2        6.71        perf-profile.children.cycles-pp.kfree
      9.37            +0.3        9.71        perf-profile.children.cycles-pp.__slab_free
      9.61            +0.4        9.98        perf-profile.children.cycles-pp.__pcs_replace_empty_main
      9.54            +0.4        9.92        perf-profile.children.cycles-pp.__refill_objects_node
      9.56            +0.4        9.94        perf-profile.children.cycles-pp.refill_objects
     67.38            +0.5       67.85        perf-profile.children.cycles-pp.get_from_partial_node
     67.96            +0.5       68.46        perf-profile.children.cycles-pp.___slab_alloc
     79.71            +0.8       80.48        perf-profile.children.cycles-pp.setsid
     79.70            +0.8       80.47        perf-profile.children.cycles-pp.__x64_sys_setsid
     79.70            +0.8       80.47        perf-profile.children.cycles-pp.ksys_setsid
     79.68            +0.8       80.45        perf-profile.children.cycles-pp.sched_autogroup_create_attach
     78.25            +0.9       79.10        perf-profile.children.cycles-pp.alloc_fair_sched_group
     78.25            +0.9       79.10        perf-profile.children.cycles-pp.sched_create_group
     77.80            +0.9       78.67        perf-profile.children.cycles-pp.__kmalloc_cache_node_noprof
     75.00            +1.0       76.04        perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     74.09            +1.1       75.15        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      0.62 ±  3%      -0.1        0.56 ±  3%  perf-profile.self.cycles-pp.zap_present_ptes
      0.73            -0.0        0.68 ±  3%  perf-profile.self.cycles-pp.rwsem_spin_on_owner
      1.14            -0.0        1.09        perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.99            -0.0        0.95        perf-profile.self.cycles-pp.anon_vma_interval_tree_insert
      1.14            -0.0        1.09        perf-profile.self.cycles-pp.get_from_partial_node
      0.78            -0.0        0.75        perf-profile.self.cycles-pp.__slab_free
      0.41            -0.0        0.38 ±  2%  perf-profile.self.cycles-pp.__vma_start_exclude_readers
      0.51            -0.0        0.48        perf-profile.self.cycles-pp._raw_spin_lock
      0.37            -0.0        0.35        perf-profile.self.cycles-pp.__pi_memset
      0.32            -0.0        0.30        perf-profile.self.cycles-pp.kfree
      0.33            -0.0        0.32        perf-profile.self.cycles-pp.dup_mmap
      0.37            -0.0        0.35        perf-profile.self.cycles-pp.up_write
      0.35            -0.0        0.33        perf-profile.self.cycles-pp.update_rq_clock_task
      0.44            -0.0        0.43        perf-profile.self.cycles-pp.__kmalloc_cache_node_noprof
      0.29            -0.0        0.27        perf-profile.self.cycles-pp.init_tg_cfs_entry
      0.36            -0.0        0.34        perf-profile.self.cycles-pp.___slab_alloc
      0.25            -0.0        0.23 ±  2%  perf-profile.self.cycles-pp.anon_vma_clone
      0.35            -0.0        0.34        perf-profile.self.cycles-pp.__anon_vma_interval_tree_remove
      0.17            -0.0        0.16        perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
      0.17            -0.0        0.16        perf-profile.self.cycles-pp.update_sg_wakeup_stats
      0.06            -0.0        0.05        perf-profile.self.cycles-pp._find_next_bit
      0.21            +0.0        0.22        perf-profile.self.cycles-pp.queue_event
      0.21 ±  2%      +0.0        0.23        perf-profile.self.cycles-pp.smp_call_function_many_cond
     74.08            +1.1       75.14        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
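
The %change column for the throughput rows can be re-derived from the two
absolute values in the table. The following is a minimal sketch, not part of
the lkp tooling: the helper names pct_change and point_delta are hypothetical,
and the numbers are copied from the rows above. It assumes that rows shown
with a "%" suffix are relative changes against the parent commit, and that the
perf-profile rows (shown without "%") are absolute differences in percentage
points, which is how the figures above appear to behave.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return new - base


# (parent 69d73421b7, patched 5ba6bc27b1) pairs copied from the table above.
relative_rows = {
    "stress-ng.session.ops": (721150, 696311),
    "stress-ng.session.ops_per_sec": (12027, 11614),
    "stress-ng.time.voluntary_context_switches": (2757163, 2658513),
    "vmstat.system.cs": (116476, 112365),
}

# perf-profile rows are already percentages of sampled cycles.
profile_rows = {
    "self.cycles-pp.native_queued_spin_lock_slowpath": (74.08, 75.14),
}

for name, (base, new) in relative_rows.items():
    print(f"{name}: {pct_change(base, new):+.1f}%")

for name, (base, new) in profile_rows.items():
    print(f"{name}: {point_delta(base, new):+.1f} points")
```

Rows whose base values are already rounded in the report (for example the cpi
and ipc rows) may not reproduce to the last digit, since the report presumably
computes the change from unrounded data.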




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


