From: kernel test robot <oliver.sang@intel.com>
To: zihan zhou <15645113830zzh@gmail.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-kernel@vger.kernel.org>, <x86@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
<aubrey.li@linux.intel.com>, <yu.c.chen@intel.com>,
<oliver.sang@intel.com>
Subject: [tip:sched/core] [sched] 2ae891b826: hackbench.throughput 6.2% regression
Date: Tue, 25 Feb 2025 10:32:13 +0800
Message-ID: <202502251026.bb927780-lkp@intel.com>
Hello,
kernel test robot noticed a 6.2% regression of hackbench.throughput on:
commit: 2ae891b826958b60919ea21c727f77bcd6ffcc2c ("sched: Reduce the default slice to avoid tasks getting an extra tick")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core
[test failed on linux-next/master d4b0fd87ff0d4338b259dc79b2b3c6f7e70e8afa]
testcase: hackbench
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:
nr_threads: 100%
iterations: 4
mode: process
ipc: socket
cpufreq_governor: performance
In addition, the commit also has a significant impact on the following tests:
+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.membarrier.ops_per_sec 10.5% regression                              |
| test machine     | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters  | cpufreq_governor=performance                                                              |
|                  | nr_threads=100%                                                                           |
|                  | test=membarrier                                                                           |
|                  | testtime=60s                                                                              |
+------------------+-------------------------------------------------------------------------------------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202502251026.bb927780-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250225/202502251026.bb927780-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
gcc-12/performance/socket/4/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp2/hackbench
commit:
f553741ac8 ("sched: Cancel the slice protection of the idle entity")
2ae891b826 ("sched: Reduce the default slice to avoid tasks getting an extra tick")
f553741ac8c0e467 2ae891b826958b60919ea21c727
---------------- ---------------------------
%stddev %change %stddev
\ | \
5457 ± 6% +30.9% 7146 ± 11% perf-c2c.DRAM.remote
1156 ± 17% +76.3% 2038 ± 19% perf-c2c.HITM.remote
790654 ± 2% +22.8% 971104 sched_debug.cpu.nr_switches.avg
659209 ± 2% +24.6% 821703 ± 3% sched_debug.cpu.nr_switches.min
1706905 +20.0% 2047861 vmstat.system.cs
296017 +5.8% 313318 ± 2% vmstat.system.in
15076 ± 48% +121.3% 33360 ± 35% proc-vmstat.numa_pages_migrated
3389933 ± 5% +15.3% 3907919 ± 3% proc-vmstat.pgalloc_normal
2565152 ± 6% +27.9% 3280218 ± 5% proc-vmstat.pgfree
15076 ± 48% +121.3% 33360 ± 35% proc-vmstat.pgmigrate_success
781.28 ± 57% -100.0% 0.08 ±223% perf-sched.sch_delay.avg.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
3394 ± 51% -100.0% 0.08 ±223% perf-sched.sch_delay.max.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
0.18 ± 74% +3280.0% 6.22 ±125% perf-sched.sch_delay.max.ms.__cond_resched.task_work_run.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
42.40 ± 41% -62.7% 15.83 ± 60% perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
86.80 ± 42% -89.4% 9.17 ± 97% perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.unlink_anon_vmas
977.49 ± 51% -99.9% 0.95 ±223% perf-sched.wait_time.avg.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
3397 ± 50% -100.0% 0.95 ±223% perf-sched.wait_time.max.ms.__cond_resched.__put_anon_vma.unlink_anon_vmas.free_pgtables.exit_mmap
433157 -6.2% 406447 hackbench.throughput
423258 -6.9% 394005 hackbench.throughput_avg
433157 -6.2% 406447 hackbench.throughput_best
411374 -6.8% 383238 hackbench.throughput_worst
143.13 +7.3% 153.65 hackbench.time.elapsed_time
143.13 +7.3% 153.65 hackbench.time.elapsed_time.max
39754543 ± 3% +56.8% 62349308 hackbench.time.involuntary_context_switches
623881 +3.9% 648284 hackbench.time.minor_page_faults
17045 +7.7% 18350 hackbench.time.system_time
900.50 +2.5% 922.71 hackbench.time.user_time
2.019e+08 +23.3% 2.489e+08 hackbench.time.voluntary_context_switches
1.61 -2.3% 1.57 perf-stat.i.MPKI
4.411e+10 -5.0% 4.192e+10 perf-stat.i.branch-instructions
0.41 ± 2% +0.0 0.44 perf-stat.i.branch-miss-rate%
1.744e+08 +1.6% 1.772e+08 perf-stat.i.branch-misses
25.15 -0.6 24.50 perf-stat.i.cache-miss-rate%
3.5e+08 -7.0% 3.255e+08 perf-stat.i.cache-misses
1.398e+09 -3.8% 1.346e+09 perf-stat.i.cache-references
1677956 ± 2% +20.8% 2027400 perf-stat.i.context-switches
1.49 +5.6% 1.57 perf-stat.i.cpi
46084 ± 8% +44.6% 66621 ± 8% perf-stat.i.cpu-migrations
935.91 +8.3% 1013 perf-stat.i.cycles-between-cache-misses
2.175e+11 -5.1% 2.065e+11 perf-stat.i.instructions
0.68 -5.2% 0.64 perf-stat.i.ipc
13.38 ± 2% +21.7% 16.28 perf-stat.i.metric.K/sec
1.61 -2.0% 1.58 perf-stat.overall.MPKI
0.39 +0.0 0.42 perf-stat.overall.branch-miss-rate%
25.05 -0.8 24.23 perf-stat.overall.cache-miss-rate%
1.49 +5.5% 1.57 perf-stat.overall.cpi
926.46 +7.6% 996.92 perf-stat.overall.cycles-between-cache-misses
0.67 -5.2% 0.64 perf-stat.overall.ipc
4.382e+10 -5.0% 4.164e+10 perf-stat.ps.branch-instructions
1.73e+08 +1.5% 1.755e+08 perf-stat.ps.branch-misses
3.475e+08 -7.0% 3.233e+08 perf-stat.ps.cache-misses
1.387e+09 -3.8% 1.334e+09 perf-stat.ps.cache-references
1662988 ± 2% +20.6% 2004942 perf-stat.ps.context-switches
44600 ± 8% +43.7% 64072 ± 7% perf-stat.ps.cpu-migrations
2.161e+11 -5.1% 2.051e+11 perf-stat.ps.instructions
3.105e+13 +2.0% 3.169e+13 perf-stat.total.instructions
8.54 ± 2% -1.0 7.54 perf-profile.calltrace.cycles-pp.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
8.46 ± 2% -1.0 7.47 perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
8.30 ± 2% -1.0 7.31 perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg
4.38 ± 2% -0.6 3.81 ± 2% perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic
3.20 ± 3% -0.3 2.85 perf-profile.calltrace.cycles-pp.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic
3.00 ± 3% -0.3 2.67 perf-profile.calltrace.cycles-pp.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor
3.40 ± 3% -0.3 3.10 ± 3% perf-profile.calltrace.cycles-pp.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
2.30 ± 3% -0.3 2.00 perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
3.25 ± 3% -0.3 2.97 ± 3% perf-profile.calltrace.cycles-pp.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
3.07 ± 2% -0.3 2.79 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
3.05 ± 3% -0.3 2.79 ± 3% perf-profile.calltrace.cycles-pp.sock_wfree.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic
2.50 ± 3% -0.2 2.29 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.__kmalloc_node_track_caller_noprof.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
2.18 ± 3% -0.2 1.99 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_node_noprof.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
1.99 ± 3% -0.2 1.82 perf-profile.calltrace.cycles-pp.clear_bhb_loop.write
1.95 ± 4% -0.2 1.78 ± 2% perf-profile.calltrace.cycles-pp.clear_bhb_loop.read
2.68 ± 3% -0.1 2.54 ± 2% perf-profile.calltrace.cycles-pp.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_write_iter.vfs_write.ksys_write
1.55 ± 3% -0.1 1.42 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read
1.55 ± 3% -0.1 1.42 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write
1.35 ± 3% -0.1 1.24 ± 3% perf-profile.calltrace.cycles-pp.__slab_free.kfree.skb_release_data.consume_skb.unix_stream_read_generic
1.04 ± 3% -0.1 0.96 ± 3% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
1.12 ± 3% -0.1 1.04 perf-profile.calltrace.cycles-pp.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_write_iter.vfs_write
0.62 ± 4% -0.1 0.56 perf-profile.calltrace.cycles-pp.__build_skb_around.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
0.72 ± 3% -0.1 0.66 perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_free_hook.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg
0.63 ± 2% -0.1 0.57 ± 3% perf-profile.calltrace.cycles-pp.skb_unlink.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
0.57 ± 3% -0.0 0.52 ± 2% perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_write_iter
1.17 ± 3% +0.2 1.32 ± 6% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
0.42 ± 50% +0.3 0.76 ± 22% perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_sync_key
1.36 ± 3% +0.5 1.88 ± 21% perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic
1.38 ± 3% +0.5 1.91 ± 21% perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg
1.43 ± 3% +0.5 1.98 ± 21% perf-profile.calltrace.cycles-pp.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
1.63 ± 3% +0.7 2.28 ± 21% perf-profile.calltrace.cycles-pp.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
36.49 +0.8 37.34 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
35.51 +0.9 36.43 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
38.59 +0.9 39.52 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
38.32 +1.0 39.27 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
34.44 +1.0 35.42 perf-profile.calltrace.cycles-pp.sock_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
33.04 +1.1 34.12 perf-profile.calltrace.cycles-pp.unix_stream_sendmsg.sock_write_iter.vfs_write.ksys_write.do_syscall_64
8.58 ± 2% -1.0 7.58 perf-profile.children.cycles-pp.unix_stream_read_actor
8.35 ± 2% -1.0 7.36 perf-profile.children.cycles-pp.__skb_datagram_iter
8.50 ± 2% -1.0 7.51 perf-profile.children.cycles-pp.skb_copy_datagram_iter
4.40 ± 2% -0.6 3.83 ± 2% perf-profile.children.cycles-pp._copy_to_iter
5.77 ± 2% -0.4 5.32 ± 3% perf-profile.children.cycles-pp.__memcg_slab_free_hook
4.41 ± 3% -0.4 3.98 perf-profile.children.cycles-pp.__check_object_size
4.80 ± 3% -0.4 4.40 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
3.24 ± 3% -0.4 2.89 perf-profile.children.cycles-pp.simple_copy_to_iter
2.98 ± 3% -0.3 2.64 perf-profile.children.cycles-pp.check_heap_object
3.98 ± 3% -0.3 3.64 perf-profile.children.cycles-pp.clear_bhb_loop
3.44 ± 2% -0.3 3.14 ± 3% perf-profile.children.cycles-pp.skb_release_head_state
3.31 ± 2% -0.3 3.03 ± 3% perf-profile.children.cycles-pp.unix_destruct_scm
3.09 ± 3% -0.3 2.82 ± 3% perf-profile.children.cycles-pp.sock_wfree
2.42 ± 3% -0.2 2.23 ± 3% perf-profile.children.cycles-pp.__slab_free
2.59 ± 2% -0.2 2.42 ± 2% perf-profile.children.cycles-pp.mod_objcg_state
1.78 ± 3% -0.2 1.62 perf-profile.children.cycles-pp.entry_SYSCALL_64
2.76 ± 3% -0.1 2.61 ± 2% perf-profile.children.cycles-pp.skb_copy_datagram_from_iter
1.38 ± 3% -0.1 1.25 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
1.30 ± 4% -0.1 1.19 perf-profile.children.cycles-pp.obj_cgroup_charge
0.65 ± 4% -0.1 0.57 perf-profile.children.cycles-pp.__build_skb_around
0.66 ± 3% -0.1 0.61 perf-profile.children.cycles-pp.refill_obj_stock
0.73 ± 3% -0.1 0.68 perf-profile.children.cycles-pp.__check_heap_object
0.59 ± 3% -0.1 0.54 ± 2% perf-profile.children.cycles-pp.rw_verify_area
0.66 ± 2% -0.1 0.61 ± 3% perf-profile.children.cycles-pp.skb_unlink
0.55 ± 4% -0.0 0.51 ± 2% perf-profile.children.cycles-pp.__virt_addr_valid
0.28 ± 3% -0.0 0.26 perf-profile.children.cycles-pp.__scm_recv_common
0.16 ± 4% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.is_vmalloc_addr
0.16 ± 3% -0.0 0.14 ± 2% perf-profile.children.cycles-pp.security_socket_recvmsg
0.17 ± 2% -0.0 0.16 perf-profile.children.cycles-pp.put_pid
0.14 ± 3% -0.0 0.12 ± 3% perf-profile.children.cycles-pp.manage_oob
0.11 -0.0 0.10 perf-profile.children.cycles-pp.wait_for_unix_gc
0.06 ± 6% +0.0 0.08 ± 11% perf-profile.children.cycles-pp.os_xsave
0.20 ± 3% +0.0 0.23 ± 7% perf-profile.children.cycles-pp.__get_user_8
0.06 ± 6% +0.0 0.09 ± 17% perf-profile.children.cycles-pp.sched_clock
0.06 ± 6% +0.0 0.09 ± 14% perf-profile.children.cycles-pp.check_preempt_wakeup_fair
0.09 ± 5% +0.0 0.13 ± 18% perf-profile.children.cycles-pp.__switch_to
0.08 ± 4% +0.0 0.12 ± 21% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.15 ± 6% +0.0 0.20 ± 10% perf-profile.children.cycles-pp.__dequeue_entity
0.25 ± 3% +0.0 0.29 ± 9% perf-profile.children.cycles-pp.rseq_ip_fixup
0.09 ± 10% +0.0 0.14 ± 15% perf-profile.children.cycles-pp.pick_eevdf
0.13 ± 7% +0.0 0.18 ± 14% perf-profile.children.cycles-pp.__enqueue_entity
0.08 ± 10% +0.0 0.12 ± 27% perf-profile.children.cycles-pp.wakeup_preempt
0.01 ±200% +0.1 0.06 ± 11% perf-profile.children.cycles-pp.vruntime_eligible
0.01 ±200% +0.1 0.07 ± 23% perf-profile.children.cycles-pp.___perf_sw_event
0.01 ±200% +0.1 0.08 ± 27% perf-profile.children.cycles-pp.put_prev_entity
0.31 ± 2% +0.1 0.38 ± 12% perf-profile.children.cycles-pp.__rseq_handle_notify_resume
0.22 ± 7% +0.1 0.30 ± 12% perf-profile.children.cycles-pp.set_next_entity
0.14 ± 5% +0.1 0.22 ± 22% perf-profile.children.cycles-pp.pick_task_fair
0.14 ± 44% +0.1 0.24 ± 15% perf-profile.children.cycles-pp.get_any_partial
0.27 ± 5% +0.1 0.37 ± 15% perf-profile.children.cycles-pp.switch_mm_irqs_off
0.33 ± 4% +0.1 0.47 ± 22% perf-profile.children.cycles-pp.enqueue_entity
0.30 ± 4% +0.2 0.46 ± 26% perf-profile.children.cycles-pp.update_load_avg
0.48 ± 4% +0.2 0.72 ± 26% perf-profile.children.cycles-pp.enqueue_task_fair
0.51 ± 3% +0.2 0.75 ± 27% perf-profile.children.cycles-pp.enqueue_task
0.48 ± 6% +0.3 0.75 ± 23% perf-profile.children.cycles-pp.pick_next_task_fair
0.49 ± 6% +0.3 0.76 ± 24% perf-profile.children.cycles-pp.__pick_next_task
0.60 ± 4% +0.3 0.89 ± 25% perf-profile.children.cycles-pp.ttwu_do_activate
1.67 ± 2% +0.6 2.23 ± 20% perf-profile.children.cycles-pp.schedule_timeout
1.64 ± 3% +0.7 2.29 ± 21% perf-profile.children.cycles-pp.unix_stream_data_wait
1.78 ± 4% +0.7 2.53 ± 21% perf-profile.children.cycles-pp.schedule
1.78 ± 4% +0.8 2.54 ± 22% perf-profile.children.cycles-pp.__schedule
36.58 +0.8 37.42 perf-profile.children.cycles-pp.ksys_write
35.60 +0.9 36.51 perf-profile.children.cycles-pp.vfs_write
34.52 +1.0 35.49 perf-profile.children.cycles-pp.sock_write_iter
33.31 +1.0 34.36 perf-profile.children.cycles-pp.unix_stream_sendmsg
4.37 ± 2% -0.6 3.79 ± 2% perf-profile.self.cycles-pp._copy_to_iter
3.94 ± 3% -0.3 3.60 perf-profile.self.cycles-pp.clear_bhb_loop
2.27 ± 3% -0.3 1.98 perf-profile.self.cycles-pp.check_heap_object
3.29 ± 2% -0.3 3.01 ± 5% perf-profile.self.cycles-pp.__memcg_slab_free_hook
2.03 ± 4% -0.3 1.76 ± 2% perf-profile.self.cycles-pp.kmem_cache_free
2.50 ± 3% -0.2 2.25 ± 3% perf-profile.self.cycles-pp.sock_wfree
2.61 ± 2% -0.2 2.37 ± 3% perf-profile.self.cycles-pp.unix_stream_read_generic
2.30 ± 3% -0.2 2.09 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
2.37 ± 3% -0.2 2.18 ± 3% perf-profile.self.cycles-pp.__slab_free
1.04 ± 4% -0.2 0.86 ± 5% perf-profile.self.cycles-pp.skb_release_data
2.19 ± 4% -0.2 2.01 ± 2% perf-profile.self.cycles-pp.mod_objcg_state
1.31 ± 3% -0.1 1.18 perf-profile.self.cycles-pp.__kmalloc_node_track_caller_noprof
1.33 ± 3% -0.1 1.21 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
1.04 ± 3% -0.1 0.93 perf-profile.self.cycles-pp.kmem_cache_alloc_node_noprof
1.13 ± 3% -0.1 1.02 perf-profile.self.cycles-pp.__alloc_skb
1.38 ± 2% -0.1 1.29 ± 2% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.74 ± 3% -0.1 0.66 perf-profile.self.cycles-pp.__skb_datagram_iter
1.11 ± 3% -0.1 1.03 ± 2% perf-profile.self.cycles-pp.sock_write_iter
0.80 ± 3% -0.1 0.74 ± 2% perf-profile.self.cycles-pp.write
0.60 ± 4% -0.1 0.54 perf-profile.self.cycles-pp.__build_skb_around
0.84 ± 4% -0.1 0.78 perf-profile.self.cycles-pp.sock_read_iter
0.69 ± 3% -0.1 0.64 perf-profile.self.cycles-pp.__check_heap_object
0.62 ± 3% -0.1 0.57 perf-profile.self.cycles-pp.refill_obj_stock
0.82 -0.0 0.77 perf-profile.self.cycles-pp.read
0.80 ± 3% -0.0 0.75 ± 3% perf-profile.self.cycles-pp.do_syscall_64
0.51 ± 4% -0.0 0.47 perf-profile.self.cycles-pp.__virt_addr_valid
0.46 ± 2% -0.0 0.43 ± 2% perf-profile.self.cycles-pp.kfree
0.59 ± 3% -0.0 0.56 ± 2% perf-profile.self.cycles-pp.__check_object_size
0.44 ± 2% -0.0 0.41 perf-profile.self.cycles-pp.entry_SYSCALL_64
0.36 ± 3% -0.0 0.32 ± 2% perf-profile.self.cycles-pp.rw_verify_area
0.43 ± 2% -0.0 0.40 ± 2% perf-profile.self.cycles-pp.unix_write_space
0.37 ± 4% -0.0 0.34 ± 2% perf-profile.self.cycles-pp.x64_sys_call
0.34 ± 3% -0.0 0.31 ± 2% perf-profile.self.cycles-pp.__cond_resched
0.29 ± 3% -0.0 0.27 perf-profile.self.cycles-pp.ksys_write
0.30 ± 2% -0.0 0.28 ± 2% perf-profile.self.cycles-pp.skb_copy_datagram_from_iter
0.18 ± 4% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.18 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.unix_destruct_scm
0.21 ± 3% -0.0 0.19 perf-profile.self.cycles-pp.__scm_recv_common
0.25 -0.0 0.23 ± 2% perf-profile.self.cycles-pp.kmalloc_reserve
0.15 ± 2% -0.0 0.14 perf-profile.self.cycles-pp.skb_unlink
0.15 ± 2% -0.0 0.14 perf-profile.self.cycles-pp.unix_scm_to_skb
0.07 ± 9% +0.0 0.10 ± 19% perf-profile.self.cycles-pp.pick_eevdf
0.09 ± 5% +0.0 0.13 ± 16% perf-profile.self.cycles-pp.__switch_to
0.11 ± 7% +0.0 0.14 ± 10% perf-profile.self.cycles-pp.__dequeue_entity
0.08 ± 5% +0.0 0.12 ± 22% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.02 ±122% +0.0 0.06 ± 17% perf-profile.self.cycles-pp.native_sched_clock
0.13 ± 8% +0.1 0.18 ± 14% perf-profile.self.cycles-pp.__enqueue_entity
0.00 +0.1 0.06 ± 9% perf-profile.self.cycles-pp.vruntime_eligible
0.27 ± 5% +0.1 0.37 ± 15% perf-profile.self.cycles-pp.switch_mm_irqs_off
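Each row above reads: base value (with optional "± stddev%"), change, new value (with optional "± stddev%"), metric name. Changes printed with a trailing "%" are relative to the base commit; changes without it (the rate and perf-profile rows) are absolute differences. The following is a minimal Python sketch for parsing such rows and recomputing the change column. It is not part of lkp-tests; the names ROW, parse_row and recomputed_change are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical helper, not part of lkp-tests. Matches one whitespace-collapsed
# comparison row, for example:
#   "433157 -6.2% 406447 hackbench.throughput"
#   "5457 ± 6% +30.9% 7146 ± 11% perf-c2c.DRAM.remote"
NUM = r"-?[\d.]+(?:e[+-]?\d+)?"
ROW = re.compile(
    r"^\s*(?P<base>" + NUM + r")"           # value on the base commit
    r"(?:\s*±\s*(?P<base_sd>\d+)%)?"        # optional relative stddev
    r"\s+(?P<change>[+-][\d.]+)(?P<pct>%?)"  # relative (%) or absolute change
    r"\s+(?P<new>" + NUM + r")"             # value on the tested commit
    r"(?:\s*±\s*(?P<new_sd>\d+)%)?"         # optional relative stddev
    r"\s+(?P<metric>\S+)\s*$"               # metric name (no spaces)
)


def parse_row(line):
    """Split one comparison row into its numeric fields and metric name."""
    m = ROW.match(line)
    if m is None:
        raise ValueError("not a comparison row: %r" % line)
    return {
        "base": float(m["base"]),
        "new": float(m["new"]),
        "change": float(m["change"]),
        "is_percent": m["pct"] == "%",
        "metric": m["metric"],
    }


def recomputed_change(row):
    """Recompute the change column from the base and new values."""
    delta = row["new"] - row["base"]
    if row["is_percent"]:
        return delta / row["base"] * 100.0  # relative change in percent
    return delta                            # absolute difference


if __name__ == "__main__":
    for text in (
        "433157 -6.2% 406447 hackbench.throughput",
        "39754543 ± 3% +56.8% 62349308 hackbench.time.involuntary_context_switches",
        "0.68 -5.2% 0.64 perf-stat.i.ipc",
    ):
        row = parse_row(text)
        print(row["metric"], row["change"], round(recomputed_change(row), 2))
```

A recomputed value can differ from the printed one in the last digit, presumably because the printed base and new values are themselves rounded: perf-c2c.DRAM.remote, for instance, recomputes to roughly +30.95% against the printed +30.9%.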
***************************************************************************************************
lkp-icl-2sp7: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/membarrier/stress-ng/60s
commit:
f553741ac8 ("sched: Cancel the slice protection of the idle entity")
2ae891b826 ("sched: Reduce the default slice to avoid tasks getting an extra tick")
f553741ac8c0e467 2ae891b826958b60919ea21c727
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.08 -0.1 0.99 mpstat.cpu.all.irq%
67.18 ± 2% -11.9% 59.20 ± 5% mpstat.max_utilization_pct
3401 ± 19% -31.4% 2332 ± 18% perf-c2c.DRAM.remote
2396 ± 3% -23.1% 1844 ± 18% perf-c2c.HITM.remote
29248 +14.3% 33418 vmstat.system.cs
788485 -9.1% 716631 vmstat.system.in
191106 -1.7% 187946 proc-vmstat.nr_anon_pages
535277 ± 2% +5.6% 565009 ± 4% proc-vmstat.numa_hit
469052 ± 2% +6.3% 498763 ± 5% proc-vmstat.numa_local
51285 ± 7% +54.3% 79119 ± 31% proc-vmstat.numa_pages_migrated
51285 ± 7% +54.3% 79119 ± 31% proc-vmstat.pgmigrate_success
16417 ± 7% +131.4% 37986 ± 78% proc-vmstat.pgreuse
505.28 -10.6% 451.92 stress-ng.membarrier.membarrier_calls_per_sec
97160 -10.5% 86939 stress-ng.membarrier.ops
1618 -10.5% 1448 stress-ng.membarrier.ops_per_sec
55094 ± 5% +277.5% 207976 ± 9% stress-ng.time.involuntary_context_switches
3195 ± 2% -8.3% 2931 stress-ng.time.percent_of_cpu_this_job_got
1921 ± 2% -8.3% 1761 stress-ng.time.system_time
1047923 +5.9% 1109900 stress-ng.time.voluntary_context_switches
5.501e+09 ± 2% -7.8% 5.074e+09 perf-stat.i.branch-instructions
30090 +14.4% 34431 perf-stat.i.context-switches
1.041e+11 ± 2% -7.6% 9.627e+10 perf-stat.i.cpu-cycles
10683 +6.7% 11402 perf-stat.i.cpu-migrations
2.73e+10 ± 2% -7.6% 2.522e+10 perf-stat.i.instructions
5.406e+09 ± 2% -7.8% 4.985e+09 perf-stat.ps.branch-instructions
29571 +14.4% 33836 perf-stat.ps.context-switches
1.024e+11 ± 2% -7.6% 9.46e+10 perf-stat.ps.cpu-cycles
10498 +6.7% 11203 perf-stat.ps.cpu-migrations
2.683e+10 ± 2% -7.6% 2.478e+10 perf-stat.ps.instructions
1.631e+12 ± 2% -7.7% 1.505e+12 perf-stat.total.instructions
698086 ± 4% -12.0% 614339 ± 3% sched_debug.cfs_rq:/.avg_vruntime.avg
918198 ± 7% -13.5% 794083 ± 6% sched_debug.cfs_rq:/.avg_vruntime.max
650282 ± 4% -12.9% 566525 ± 4% sched_debug.cfs_rq:/.avg_vruntime.min
698086 ± 4% -12.0% 614339 ± 3% sched_debug.cfs_rq:/.min_vruntime.avg
918198 ± 7% -13.5% 794083 ± 6% sched_debug.cfs_rq:/.min_vruntime.max
650282 ± 4% -12.9% 566525 ± 4% sched_debug.cfs_rq:/.min_vruntime.min
13.48 ± 36% +250.6% 47.25 ± 40% sched_debug.cfs_rq:/.removed.load_avg.avg
77.26 ± 17% +91.9% 148.27 ± 24% sched_debug.cfs_rq:/.removed.load_avg.stddev
5.08 ± 33% +246.5% 17.60 ± 35% sched_debug.cfs_rq:/.removed.runnable_avg.avg
212.33 ± 20% +30.1% 276.17 ± 7% sched_debug.cfs_rq:/.removed.runnable_avg.max
30.44 ± 21% +89.0% 57.52 ± 14% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
5.08 ± 33% +246.6% 17.60 ± 35% sched_debug.cfs_rq:/.removed.util_avg.avg
212.25 ± 21% +30.1% 276.08 ± 7% sched_debug.cfs_rq:/.removed.util_avg.max
30.43 ± 21% +89.0% 57.51 ± 14% sched_debug.cfs_rq:/.removed.util_avg.stddev
15701 +12.8% 17719 sched_debug.cpu.nr_switches.avg
11778 ± 7% +20.3% 14165 ± 8% sched_debug.cpu.nr_switches.min
-202.17 +21.0% -244.58 sched_debug.cpu.nr_uninterruptible.min
1.43 ± 36% -99.6% 0.01 ±223% perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
0.94 ± 23% -91.9% 0.08 ±223% perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
1.60 ± 68% -99.9% 0.00 ±223% perf-sched.sch_delay.avg.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
1.39 ± 8% +71.7% 2.38 ± 7% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
1.95 ± 5% +23.7% 2.41 ± 5% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_private_expedited
0.89 ± 4% -16.0% 0.75 ± 3% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
0.01 ± 25% +75.0% 0.02 ± 34% perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.06 ± 11% -37.5% 0.04 ± 40% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.80 ±145% +478.8% 4.62 ± 52% perf-sched.sch_delay.max.ms.__cond_resched.__mutex_lock.constprop.0.membarrier_private_expedited
5.29 ± 41% -99.9% 0.01 ±223% perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
6.37 ± 13% -93.7% 0.40 ±223% perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
2.22 ± 49% -99.9% 0.00 ±223% perf-sched.sch_delay.max.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
10.40 ± 13% +32.1% 13.74 ± 5% perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
4.55 ± 5% -34.9% 2.96 ± 42% perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.98 ± 4% +33.4% 1.30 ± 6% perf-sched.total_sch_delay.average.ms
22.34 -12.3% 19.59 perf-sched.total_wait_and_delay.average.ms
102076 +18.6% 121096 perf-sched.total_wait_and_delay.count.ms
21.37 -14.4% 18.29 perf-sched.total_wait_time.average.ms
515.07 ± 36% +63.2% 840.46 ± 16% perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
11.25 ± 5% +56.4% 17.59 ± 7% perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
15.80 -13.5% 13.67 perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
487.31 ± 4% +16.4% 567.38 ± 2% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
8.00 ± 26% +95.8% 15.67 ± 20% perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1384 ± 12% +58.1% 2188 ± 8% perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
10678 ± 7% +270.1% 39521 ± 4% perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_private_expedited
85629 -12.4% 75039 ± 3% perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
2443 ± 44% -58.3% 1018 perf-sched.wait_and_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
2099 ± 55% -76.1% 501.21 perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
15.94 ± 9% -86.6% 2.13 ±223% perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
515.06 ± 36% +63.2% 840.45 ± 16% perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
15.25 ± 3% -85.0% 2.29 ±223% perf-sched.wait_time.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
427.24 ± 78% -99.6% 1.55 ±107% perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
10.38 ± 53% -95.2% 0.50 ±223% perf-sched.wait_time.avg.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
48.58 ±185% -94.2% 2.80 ± 99% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
9.86 ± 5% +54.2% 15.21 ± 7% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.membarrier_global_expedited
14.92 -13.4% 12.92 perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
1.30 ± 8% -11.1% 1.15 ± 6% perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
487.30 ± 4% +16.4% 567.36 ± 2% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
6.13 ±141% +268.5% 22.60 ± 17% perf-sched.wait_time.max.ms.__cond_resched.__mutex_lock.constprop.0.membarrier_private_expedited
25.13 ± 9% -91.5% 2.13 ±223% perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.stop_two_cpus.migrate_swap.task_numa_migrate
25.92 ± 12% -86.1% 3.61 ±223% perf-sched.wait_time.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
2260 ± 59% -99.9% 3.00 ±118% perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
13.02 ± 43% -96.2% 0.50 ±223% perf-sched.wait_time.max.ms.io_schedule.migration_entry_wait_on_locked.__handle_mm_fault.handle_mm_fault
2443 ± 44% -58.3% 1018 perf-sched.wait_time.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
2097 ± 55% -76.1% 500.54 perf-sched.wait_time.max.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki