Subject: [linus:master] [slab] 5ba6bc27b1: stress-ng.session.ops_per_sec 3.4% regression
From: kernel test robot @ 2026-04-27 8:35 UTC
To: Vlastimil Babka
Cc: oe-lkp, lkp, linux-kernel, Harry Yoo, Hao Li, linux-mm, oliver.sang
Hello,
kernel test robot noticed a 3.4% regression of stress-ng.session.ops_per_sec on:
commit: 5ba6bc27b1f99b35aa528409a8e223136c59e0af ("slab: decouple pointer to barn from kmem_cache_node")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[still regression on linus/master 1d51b370a0f8f642f4fc84c795fbedac0fcdbbd2]
[still regression on linux-next/master 936c21068d7ade00325e40d82bfd2f3f29d9f659]
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: session
cpufreq_governor: performance
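
As a rough illustration of what the parameters above amount to, the sketch below builds a plain stress-ng command line from them. It is an assumption, not the exact command the LKP job runs: the helper name stress_ng_argv is hypothetical, nr_threads is read as a percentage of the 224 hardware threads, and the performance governor is assumed to be set separately.

```python
def stress_ng_argv(test, nr_cpus, nr_threads_pct, testtime):
    """Build a stress-ng command line for one stressor (nothing is executed).

    Hypothetical helper: the real LKP job may pass additional options.
    """
    # nr_threads: 100% is interpreted as one stressor instance per CPU thread.
    instances = max(1, nr_cpus * nr_threads_pct // 100)
    return [
        "stress-ng",
        f"--{test}", str(instances),   # e.g. --session 224
        "--timeout", testtime,         # testtime: 60s
        "--metrics-brief",             # print a short ops / ops-per-second summary
    ]


argv = stress_ng_argv("session", nr_cpus=224, nr_threads_pct=100, testtime="60s")
print(" ".join(argv))
```

The job file in the download archive referenced below remains the authoritative way to reproduce the run.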
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202604271639.21c44b96-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260427/202604271639.21c44b96-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-spr-r02/session/stress-ng/60s
commit:
69d73421b7 ("slab: remove alloc_full_sheaf()")
5ba6bc27b1 ("slab: decouple pointer to barn from kmem_cache_node")
69d73421b76e3d95 5ba6bc27b1f99b35aa528409a8e
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
721150 -3.4% 696311 stress-ng.session.ops
12027 -3.4% 11614 stress-ng.session.ops_per_sec
2757163 -3.6% 2658513 stress-ng.time.voluntary_context_switches
37.76 -1.4% 37.23 turbostat.RAMWatt
116476 -3.5% 112365 vmstat.system.cs
6.33 -1.7% 6.22 perf-stat.i.MPKI
0.52 -0.0 0.50 perf-stat.i.branch-miss-rate%
96378014 -4.2% 92350977 perf-stat.i.branch-misses
5.781e+08 -3.1% 5.603e+08 perf-stat.i.cache-misses
1.298e+09 -3.1% 1.257e+09 perf-stat.i.cache-references
121290 -3.5% 117042 perf-stat.i.context-switches
6.34 +2.3% 6.49 perf-stat.i.cpi
1002 +4.1% 1043 perf-stat.i.cycles-between-cache-misses
9.116e+10 -1.4% 8.985e+10 perf-stat.i.instructions
0.16 -2.3% 0.16 perf-stat.i.ipc
6.38 -1.7% 6.27 perf-stat.overall.MPKI
0.50 -0.0 0.48 perf-stat.overall.branch-miss-rate%
6.39 +2.3% 6.53 perf-stat.overall.cpi
1000 +4.1% 1041 perf-stat.overall.cycles-between-cache-misses
0.16 -2.2% 0.15 perf-stat.overall.ipc
91907282 -3.8% 88452571 perf-stat.ps.branch-misses
5.614e+08 -2.7% 5.462e+08 perf-stat.ps.cache-misses
1.258e+09 -2.8% 1.223e+09 perf-stat.ps.cache-references
117705 -3.2% 113987 perf-stat.ps.context-switches
9.20 -0.3 8.85 perf-profile.calltrace.cycles-pp._Fork
8.95 -0.3 8.62 perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
8.95 -0.3 8.62 perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
8.95 -0.3 8.63 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
8.95 -0.3 8.63 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._Fork
7.83 -0.3 7.51 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.83 -0.3 7.51 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
8.66 -0.3 8.36 ± 2% perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.48 -0.3 7.18 perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64
8.45 -0.3 8.15 ± 2% perf-profile.calltrace.cycles-pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
7.48 -0.3 7.18 perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.48 -0.3 7.18 perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.48 -0.3 7.18 perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.28 -0.3 6.99 perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
7.28 -0.3 6.99 perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
7.27 -0.3 6.98 perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
8.08 -0.3 7.80 ± 2% perf-profile.calltrace.cycles-pp.dup_mmap.dup_mm.copy_process.kernel_clone.__do_sys_clone
1.66 ± 2% -0.1 1.55 perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.__mmput.exit_mm.do_exit
1.54 ± 2% -0.1 1.43 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.__mmput.exit_mm
1.50 ± 2% -0.1 1.40 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap.__mmput
1.48 ± 3% -0.1 1.37 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap
1.10 ± 3% -0.1 1.01 perf-profile.calltrace.cycles-pp.zap_present_ptes.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
1.38 -0.1 1.30 perf-profile.calltrace.cycles-pp.online_fair_sched_group.sched_autogroup_create_attach.ksys_setsid.__x64_sys_setsid.do_syscall_64
1.04 -0.0 1.00 perf-profile.calltrace.cycles-pp.anon_vma_interval_tree_insert.anon_vma_clone.anon_vma_fork.dup_mmap.dup_mm
0.95 +0.1 1.01 perf-profile.calltrace.cycles-pp.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt._raw_spin_unlock_irqrestore.__refill_objects_node
0.97 +0.1 1.03 perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt._raw_spin_unlock_irqrestore.__refill_objects_node.refill_objects
0.97 +0.1 1.03 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_unlock_irqrestore.__refill_objects_node.refill_objects.__pcs_replace_empty_main.__kmalloc_cache_node_noprof
0.97 +0.1 1.03 perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt._raw_spin_unlock_irqrestore.__refill_objects_node.refill_objects.__pcs_replace_empty_main
3.46 +0.1 3.60 perf-profile.calltrace.cycles-pp.__slab_free.kfree.free_fair_sched_group.sched_free_group_rcu.rcu_do_batch
2.98 +0.1 3.12 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__slab_free.__refill_objects_node.refill_objects
3.00 +0.1 3.15 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__slab_free.__refill_objects_node.refill_objects.__pcs_replace_empty_main
3.04 +0.2 3.19 perf-profile.calltrace.cycles-pp.__slab_free.__refill_objects_node.refill_objects.__pcs_replace_empty_main.__kmalloc_cache_node_noprof
5.20 +0.2 5.36 perf-profile.calltrace.cycles-pp.free_fair_sched_group.sched_free_group_rcu.rcu_do_batch.rcu_core.handle_softirqs
5.20 +0.2 5.37 perf-profile.calltrace.cycles-pp.sched_free_group_rcu.rcu_do_batch.rcu_core.handle_softirqs.__irq_exit_rcu
5.18 +0.2 5.36 perf-profile.calltrace.cycles-pp.kfree.free_fair_sched_group.sched_free_group_rcu.rcu_do_batch.rcu_core
4.78 +0.2 4.96 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__refill_objects_node.refill_objects.__pcs_replace_empty_main.__kmalloc_cache_node_noprof
4.76 +0.2 4.94 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__refill_objects_node.refill_objects.__pcs_replace_empty_main
9.20 +0.4 9.59 perf-profile.calltrace.cycles-pp.__pcs_replace_empty_main.__kmalloc_cache_node_noprof.alloc_fair_sched_group.sched_create_group.sched_autogroup_create_attach
9.13 +0.4 9.52 perf-profile.calltrace.cycles-pp.__refill_objects_node.refill_objects.__pcs_replace_empty_main.__kmalloc_cache_node_noprof.alloc_fair_sched_group
9.15 +0.4 9.55 perf-profile.calltrace.cycles-pp.refill_objects.__pcs_replace_empty_main.__kmalloc_cache_node_noprof.alloc_fair_sched_group.sched_create_group
60.96 +0.4 61.41 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.get_from_partial_node.___slab_alloc.__kmalloc_cache_node_noprof.alloc_fair_sched_group
67.38 +0.5 67.85 perf-profile.calltrace.cycles-pp.get_from_partial_node.___slab_alloc.__kmalloc_cache_node_noprof.alloc_fair_sched_group.sched_create_group
60.37 +0.5 60.85 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.get_from_partial_node.___slab_alloc.__kmalloc_cache_node_noprof
67.95 +0.5 68.46 perf-profile.calltrace.cycles-pp.___slab_alloc.__kmalloc_cache_node_noprof.alloc_fair_sched_group.sched_create_group.sched_autogroup_create_attach
2.67 ± 7% +0.6 3.24 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__slab_free.kfree.free_fair_sched_group.sched_free_group_rcu
2.65 ± 7% +0.6 3.22 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__slab_free.kfree.free_fair_sched_group
79.70 +0.8 80.48 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.setsid
79.70 +0.8 80.48 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.setsid
79.71 +0.8 80.48 perf-profile.calltrace.cycles-pp.setsid
79.70 +0.8 80.47 perf-profile.calltrace.cycles-pp.__x64_sys_setsid.do_syscall_64.entry_SYSCALL_64_after_hwframe.setsid
79.70 +0.8 80.47 perf-profile.calltrace.cycles-pp.ksys_setsid.__x64_sys_setsid.do_syscall_64.entry_SYSCALL_64_after_hwframe.setsid
79.68 +0.8 80.45 perf-profile.calltrace.cycles-pp.sched_autogroup_create_attach.ksys_setsid.__x64_sys_setsid.do_syscall_64.entry_SYSCALL_64_after_hwframe
78.25 +0.9 79.10 perf-profile.calltrace.cycles-pp.sched_create_group.sched_autogroup_create_attach.ksys_setsid.__x64_sys_setsid.do_syscall_64
78.21 +0.9 79.06 perf-profile.calltrace.cycles-pp.alloc_fair_sched_group.sched_create_group.sched_autogroup_create_attach.ksys_setsid.__x64_sys_setsid
77.68 +0.9 78.56 perf-profile.calltrace.cycles-pp.__kmalloc_cache_node_noprof.alloc_fair_sched_group.sched_create_group.sched_autogroup_create_attach.ksys_setsid
9.21 -0.3 8.87 perf-profile.children.cycles-pp._Fork
8.95 -0.3 8.62 perf-profile.children.cycles-pp.__do_sys_clone
8.95 -0.3 8.62 perf-profile.children.cycles-pp.kernel_clone
8.66 -0.3 8.36 ± 2% perf-profile.children.cycles-pp.copy_process
7.61 -0.3 7.30 perf-profile.children.cycles-pp.do_exit
7.61 -0.3 7.30 perf-profile.children.cycles-pp.x64_sys_call
7.61 -0.3 7.30 perf-profile.children.cycles-pp.__x64_sys_exit_group
7.61 -0.3 7.30 perf-profile.children.cycles-pp.do_group_exit
8.45 -0.3 8.15 ± 2% perf-profile.children.cycles-pp.dup_mm
7.29 -0.3 7.00 perf-profile.children.cycles-pp.exit_mm
7.28 -0.3 6.99 perf-profile.children.cycles-pp.__mmput
7.27 -0.3 6.98 perf-profile.children.cycles-pp.exit_mmap
8.12 -0.3 7.83 ± 2% perf-profile.children.cycles-pp.dup_mmap
1.67 ± 2% -0.1 1.55 perf-profile.children.cycles-pp.unmap_vmas
1.51 ± 2% -0.1 1.40 perf-profile.children.cycles-pp.zap_pmd_range
1.54 ± 2% -0.1 1.43 perf-profile.children.cycles-pp.unmap_page_range
1.49 ± 3% -0.1 1.38 perf-profile.children.cycles-pp.zap_pte_range
1.52 ± 3% -0.1 1.42 ± 3% perf-profile.children.cycles-pp.exc_page_fault
1.13 ± 3% -0.1 1.04 perf-profile.children.cycles-pp.zap_present_ptes
1.38 -0.1 1.31 perf-profile.children.cycles-pp.online_fair_sched_group
0.77 -0.1 0.72 ± 2% perf-profile.children.cycles-pp.rwsem_spin_on_owner
0.76 -0.0 0.71 perf-profile.children.cycles-pp.__vma_start_write
1.05 -0.0 1.00 perf-profile.children.cycles-pp.anon_vma_interval_tree_insert
0.80 -0.0 0.76 perf-profile.children.cycles-pp.sched_unregister_group_rcu
0.80 -0.0 0.76 perf-profile.children.cycles-pp.unregister_fair_sched_group
0.66 -0.0 0.62 perf-profile.children.cycles-pp._raw_spin_lock
0.64 -0.0 0.61 perf-profile.children.cycles-pp.remove_entity_load_avg
0.80 -0.0 0.77 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
0.44 -0.0 0.41 ± 2% perf-profile.children.cycles-pp.__vma_start_exclude_readers
0.64 -0.0 0.61 perf-profile.children.cycles-pp.kmem_cache_free
0.50 -0.0 0.47 perf-profile.children.cycles-pp.raw_spin_rq_lock_nested
0.40 -0.0 0.37 perf-profile.children.cycles-pp.attach_entity_cfs_rq
0.39 -0.0 0.37 perf-profile.children.cycles-pp.__pi_memset
0.37 -0.0 0.35 perf-profile.children.cycles-pp.tear_down_vmas
0.40 -0.0 0.38 perf-profile.children.cycles-pp.up_write
0.38 -0.0 0.36 perf-profile.children.cycles-pp.__anon_vma_interval_tree_remove
0.27 -0.0 0.25 perf-profile.children.cycles-pp.wake_up_new_task
0.21 -0.0 0.19 ± 2% perf-profile.children.cycles-pp.sched_balance_find_dst_group
0.38 -0.0 0.37 perf-profile.children.cycles-pp.update_rq_clock_task
0.17 ± 2% -0.0 0.16 ± 2% perf-profile.children.cycles-pp.wp_page_copy
0.30 -0.0 0.29 perf-profile.children.cycles-pp.init_tg_cfs_entry
0.24 -0.0 0.23 perf-profile.children.cycles-pp.select_task_rq_fair
0.32 -0.0 0.31 perf-profile.children.cycles-pp.__schedule
0.17 -0.0 0.16 ± 2% perf-profile.children.cycles-pp.__rb_erase_color
0.20 -0.0 0.19 perf-profile.children.cycles-pp.__ordered_events__flush
0.09 -0.0 0.08 perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.20 -0.0 0.19 perf-profile.children.cycles-pp.perf_session__process_user_event
0.14 -0.0 0.13 perf-profile.children.cycles-pp.schedule_tail
0.20 -0.0 0.19 perf-profile.children.cycles-pp.update_sg_wakeup_stats
0.24 +0.0 0.25 perf-profile.children.cycles-pp.on_each_cpu_cond_mask
0.24 +0.0 0.25 perf-profile.children.cycles-pp.smp_call_function_many_cond
6.34 +0.1 6.46 perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
6.53 +0.2 6.71 perf-profile.children.cycles-pp.free_fair_sched_group
6.53 +0.2 6.71 perf-profile.children.cycles-pp.sched_free_group_rcu
6.53 +0.2 6.71 perf-profile.children.cycles-pp.kfree
9.37 +0.3 9.71 perf-profile.children.cycles-pp.__slab_free
9.61 +0.4 9.98 perf-profile.children.cycles-pp.__pcs_replace_empty_main
9.54 +0.4 9.92 perf-profile.children.cycles-pp.__refill_objects_node
9.56 +0.4 9.94 perf-profile.children.cycles-pp.refill_objects
67.38 +0.5 67.85 perf-profile.children.cycles-pp.get_from_partial_node
67.96 +0.5 68.46 perf-profile.children.cycles-pp.___slab_alloc
79.71 +0.8 80.48 perf-profile.children.cycles-pp.setsid
79.70 +0.8 80.47 perf-profile.children.cycles-pp.__x64_sys_setsid
79.70 +0.8 80.47 perf-profile.children.cycles-pp.ksys_setsid
79.68 +0.8 80.45 perf-profile.children.cycles-pp.sched_autogroup_create_attach
78.25 +0.9 79.10 perf-profile.children.cycles-pp.alloc_fair_sched_group
78.25 +0.9 79.10 perf-profile.children.cycles-pp.sched_create_group
77.80 +0.9 78.67 perf-profile.children.cycles-pp.__kmalloc_cache_node_noprof
75.00 +1.0 76.04 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
74.09 +1.1 75.15 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.62 ± 3% -0.1 0.56 ± 3% perf-profile.self.cycles-pp.zap_present_ptes
0.73 -0.0 0.68 ± 3% perf-profile.self.cycles-pp.rwsem_spin_on_owner
1.14 -0.0 1.09 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.99 -0.0 0.95 perf-profile.self.cycles-pp.anon_vma_interval_tree_insert
1.14 -0.0 1.09 perf-profile.self.cycles-pp.get_from_partial_node
0.78 -0.0 0.75 perf-profile.self.cycles-pp.__slab_free
0.41 -0.0 0.38 ± 2% perf-profile.self.cycles-pp.__vma_start_exclude_readers
0.51 -0.0 0.48 perf-profile.self.cycles-pp._raw_spin_lock
0.37 -0.0 0.35 perf-profile.self.cycles-pp.__pi_memset
0.32 -0.0 0.30 perf-profile.self.cycles-pp.kfree
0.33 -0.0 0.32 perf-profile.self.cycles-pp.dup_mmap
0.37 -0.0 0.35 perf-profile.self.cycles-pp.up_write
0.35 -0.0 0.33 perf-profile.self.cycles-pp.update_rq_clock_task
0.44 -0.0 0.43 perf-profile.self.cycles-pp.__kmalloc_cache_node_noprof
0.29 -0.0 0.27 perf-profile.self.cycles-pp.init_tg_cfs_entry
0.36 -0.0 0.34 perf-profile.self.cycles-pp.___slab_alloc
0.25 -0.0 0.23 ± 2% perf-profile.self.cycles-pp.anon_vma_clone
0.35 -0.0 0.34 perf-profile.self.cycles-pp.__anon_vma_interval_tree_remove
0.17 -0.0 0.16 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
0.17 -0.0 0.16 perf-profile.self.cycles-pp.update_sg_wakeup_stats
0.06 -0.0 0.05 perf-profile.self.cycles-pp._find_next_bit
0.21 +0.0 0.22 perf-profile.self.cycles-pp.queue_event
0.21 ± 2% +0.0 0.23 perf-profile.self.cycles-pp.smp_call_function_many_cond
74.08 +1.1 75.14 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
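
The comparison rows above can be checked mechanically. The following is a minimal sketch, assuming each row has the layout "base [± N%] change new [± N%] metric"; the regular expression and the parse_row helper are illustrative and not part of lkp-tests. Rows whose change column ends in "%" are relative changes, while the perf-profile rows appear to report the absolute difference in percentage points, so the helper returns both.

```python
import re

# One comparison row: base value, optional "± N%", change column,
# new value, optional "± N%", metric name.
ROW = re.compile(
    r"^\s*(?P<base>[\d.]+(?:e[+-]?\d+)?)"
    r"(?:\s+±\s*\d+%)?"
    r"\s+(?P<change>[+-][\d.]+%?)"
    r"\s+(?P<new>[\d.]+(?:e[+-]?\d+)?)"
    r"(?:\s+±\s*\d+%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Return the parsed row as a dict, or None if the line is not a row."""
    m = ROW.match(line)
    if m is None:
        return None
    base = float(m.group("base"))
    new = float(m.group("new"))
    return {
        "metric": m.group("metric"),
        "base": base,
        "new": new,
        "reported": m.group("change"),         # change column as printed
        "delta": new - base,                   # absolute difference
        "pct": (new - base) / base * 100.0,    # relative change in percent
    }


ops = parse_row("721150 -3.4% 696311 stress-ng.session.ops")
lock = parse_row(
    "74.08 +1.1 75.14 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath"
)
print(f"{ops['metric']}: {ops['pct']:+.1f}%")
print(f"{lock['metric']}: {lock['delta']:+.1f}")
```

Recomputing the values this way reproduces the reported -3.4% for stress-ng.session.ops and the +1.1 point increase in native_queued_spin_lock_slowpath self time.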
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki