* [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
@ 2025-01-20 6:57 kernel test robot
2025-01-20 11:27 ` Mateusz Guzik
2025-01-20 15:50 ` Oleg Nesterov
0 siblings, 2 replies; 11+ messages in thread
From: kernel test robot @ 2025-01-20 6:57 UTC (permalink / raw)
To: Oleg Nesterov
Cc: oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel,
oliver.sang
Hi, Oleg Nesterov,

we previously reported
"[brauner-vfs:vfs-6.14.misc] [pipe_read] aaec5a95d5: hackbench.throughput 7.5% regression"
in
https://lore.kernel.org/all/202501101015.90874b3a-lkp@intel.com/
but it seemed both you and Christian Brauner thought it could be ignored.

Now we have captured a regression in another test case. Since the perf data
contains entries like the ones below, we are not sure whether this supplies
any useful information, but just FYI; sorry if it is still of little value.
9.45 -6.3 3.13 ± 9% perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
...
10.00 -6.5 3.53 ± 9% perf-profile.children.cycles-pp.pipe_read
2.34 -1.3 1.07 ± 9% perf-profile.children.cycles-pp.pipe_poll
Hello,
kernel test robot noticed an 11.1% regression of stress-ng.poll.ops_per_sec on:
commit: aaec5a95d59615523db03dd53c2052f0a87beea7 ("pipe_read: don't wake up the writer if the pipe is still full")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
[test failed on linux-next/master b323d8e7bc03d27dec646bfdccb7d1a92411f189]
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: poll
cpufreq_governor: performance
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202501201311.6d25a0b9-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250120/202501201311.6d25a0b9-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/poll/stress-ng/60s
commit:
d2fc0ed52a ("Merge branch 'vfs-6.14.uncached_buffered_io'")
aaec5a95d5 ("pipe_read: don't wake up the writer if the pipe is still full")
d2fc0ed52a284a13 aaec5a95d59615523db03dd53c2
---------------- ---------------------------
%stddev %change %stddev
\ | \
4.049e+08 -54.4% 1.847e+08 cpuidle..usage
70.66 ± 2% -40.9% 41.74 ± 3% vmstat.procs.r
13673771 -55.5% 6089704 vmstat.system.cs
3388831 -9.5% 3067987 vmstat.system.in
6.62 +12.4 19.04 ± 6% mpstat.cpu.all.irq%
0.07 -0.0 0.06 mpstat.cpu.all.soft%
17.51 -6.8 10.76 mpstat.cpu.all.sys%
4.98 -1.0 4.01 ± 2% mpstat.cpu.all.usr%
18.00 ± 4% +110.2% 37.83 ± 37% mpstat.max_utilization.seconds
3483 ± 17% -38.5% 2141 ± 9% perf-c2c.DRAM.local
2266 ± 5% +256.6% 8081 ± 5% perf-c2c.DRAM.remote
173996 ± 2% -50.8% 85520 ± 12% perf-c2c.HITM.local
1097 ± 10% +97.5% 2168 ± 5% perf-c2c.HITM.remote
175093 ± 2% -49.9% 87689 ± 11% perf-c2c.HITM.total
2643650 +5.6% 2790795 ± 2% proc-vmstat.nr_active_anon
3308720 +4.6% 3459916 proc-vmstat.nr_file_pages
2427405 +6.2% 2578600 ± 2% proc-vmstat.nr_shmem
2643650 +5.6% 2790795 ± 2% proc-vmstat.nr_zone_active_anon
235308 -19.3% 189819 ± 12% proc-vmstat.numa_hint_faults_local
1437439 -5.5% 1358004 ± 3% proc-vmstat.pgfault
8.71e+08 -11.1% 7.745e+08 stress-ng.poll.ops
14516970 -11.1% 12907569 stress-ng.poll.ops_per_sec
181583 -57.1% 77818 ± 21% stress-ng.time.involuntary_context_switches
85474 +1.6% 86823 stress-ng.time.minor_page_faults
6150 -47.8% 3208 stress-ng.time.percent_of_cpu_this_job_got
2993 -50.6% 1477 stress-ng.time.system_time
711.20 -36.0% 454.85 stress-ng.time.user_time
4.427e+08 -56.2% 1.937e+08 stress-ng.time.voluntary_context_switches
834292 ± 4% -60.5% 329635 ± 12% sched_debug.cfs_rq:/.avg_vruntime.avg
520206 ± 5% -70.0% 155956 ± 44% sched_debug.cfs_rq:/.avg_vruntime.min
80954 ± 25% -78.0% 17846 ± 55% sched_debug.cfs_rq:/.left_deadline.avg
312463 ± 46% -69.5% 95397 ± 67% sched_debug.cfs_rq:/.left_deadline.stddev
80943 ± 25% -78.0% 17842 ± 55% sched_debug.cfs_rq:/.left_vruntime.avg
312436 ± 46% -69.5% 95382 ± 67% sched_debug.cfs_rq:/.left_vruntime.stddev
834292 ± 4% -60.5% 329635 ± 12% sched_debug.cfs_rq:/.min_vruntime.avg
520206 ± 5% -70.0% 155956 ± 44% sched_debug.cfs_rq:/.min_vruntime.min
80943 ± 25% -78.0% 17842 ± 55% sched_debug.cfs_rq:/.right_vruntime.avg
312436 ± 46% -69.5% 95382 ± 67% sched_debug.cfs_rq:/.right_vruntime.stddev
224.99 ± 4% -32.0% 152.91 ± 9% sched_debug.cfs_rq:/.runnable_avg.avg
212.19 ± 4% -30.4% 147.67 ± 8% sched_debug.cfs_rq:/.util_avg.avg
28.51 ± 5% -26.3% 21.02 ± 29% sched_debug.cfs_rq:/.util_est.avg
0.18 ± 6% -41.8% 0.11 ± 30% sched_debug.cpu.nr_running.avg
0.38 ± 6% -23.2% 0.29 ± 14% sched_debug.cpu.nr_running.stddev
1893149 -63.0% 700843 ± 44% sched_debug.cpu.nr_switches.avg
2007234 -62.4% 755101 ± 43% sched_debug.cpu.nr_switches.max
845934 ± 16% -65.9% 288868 ± 45% sched_debug.cpu.nr_switches.min
136058 ± 6% -62.3% 51280 ± 45% sched_debug.cpu.nr_switches.stddev
54.75 ± 29% -35.9% 35.08 ± 21% sched_debug.cpu.nr_uninterruptible.max
0.13 ± 3% +138.2% 0.30 ± 3% perf-stat.i.MPKI
4.279e+10 -24.6% 3.224e+10 perf-stat.i.branch-instructions
0.57 -0.1 0.47 ± 2% perf-stat.i.branch-miss-rate%
2.343e+08 -36.3% 1.493e+08 ± 2% perf-stat.i.branch-misses
3.22 ± 2% +7.6 10.77 ± 2% perf-stat.i.cache-miss-rate%
29560239 ± 2% +71.2% 50594050 ± 4% perf-stat.i.cache-misses
8.946e+08 -49.8% 4.494e+08 ± 2% perf-stat.i.cache-references
14167388 -55.5% 6301671 perf-stat.i.context-switches
1.33 +45.2% 1.93 ± 3% perf-stat.i.cpi
2.778e+11 +8.5% 3.013e+11 ± 2% perf-stat.i.cpu-cycles
2501362 -83.7% 408632 perf-stat.i.cpu-migrations
15733 ± 3% -52.7% 7444 ± 7% perf-stat.i.cycles-between-cache-misses
2.138e+11 -25.5% 1.594e+11 perf-stat.i.instructions
0.76 -29.3% 0.54 ± 2% perf-stat.i.ipc
74.48 -59.7% 29.98 perf-stat.i.metric.K/sec
21318 -6.8% 19869 ± 6% perf-stat.i.minor-faults
21318 -6.8% 19869 ± 6% perf-stat.i.page-faults
0.14 ± 2% +130.0% 0.32 ± 4% perf-stat.overall.MPKI
0.55 -0.1 0.46 ± 2% perf-stat.overall.branch-miss-rate%
3.30 +7.9 11.24 ± 2% perf-stat.overall.cache-miss-rate%
1.30 +45.5% 1.89 ± 3% perf-stat.overall.cpi
9426 ± 2% -36.7% 5969 ± 5% perf-stat.overall.cycles-between-cache-misses
0.77 -31.2% 0.53 ± 3% perf-stat.overall.ipc
4.206e+10 -24.7% 3.167e+10 perf-stat.ps.branch-instructions
2.299e+08 -36.3% 1.464e+08 ± 2% perf-stat.ps.branch-misses
28994632 ± 2% +71.3% 49675855 ± 4% perf-stat.ps.cache-misses
8.795e+08 -49.8% 4.419e+08 ± 2% perf-stat.ps.cache-references
13938322 -55.5% 6196929 perf-stat.ps.context-switches
2.732e+11 +8.3% 2.96e+11 ± 2% perf-stat.ps.cpu-cycles
2460505 -83.7% 402100 perf-stat.ps.cpu-migrations
2.102e+11 -25.5% 1.565e+11 perf-stat.ps.instructions
0.01 ± 82% +259.6% 0.05 ± 57% perf-stat.ps.major-faults
20785 -7.2% 19284 ± 6% perf-stat.ps.minor-faults
20785 -7.2% 19284 ± 6% perf-stat.ps.page-faults
1.283e+13 -25.5% 9.55e+12 perf-stat.total.instructions
40.62 -19.6 21.06 ± 10% perf-profile.calltrace.cycles-pp.stress_run
15.66 -8.0 7.64 ± 9% perf-profile.calltrace.cycles-pp.stress_poll.stress_run
12.84 -7.7 5.14 ± 16% perf-profile.calltrace.cycles-pp.write.stress_run
12.48 -7.5 4.98 ± 10% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
13.32 -7.3 6.04 ± 9% perf-profile.calltrace.cycles-pp.read.stress_poll.stress_run
11.55 -7.1 4.46 ± 17% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write.stress_run
11.38 -7.0 4.33 ± 16% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.stress_run
12.17 -7.0 5.16 ± 9% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read.stress_poll.stress_run
12.00 -7.0 5.02 ± 9% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read.stress_poll.stress_run
10.32 -6.7 3.60 ± 17% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.stress_run
10.86 -6.7 4.16 ± 9% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read.stress_poll
10.38 -6.6 3.81 ± 9% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
9.91 -6.4 3.52 ± 8% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
7.32 -6.3 0.97 ± 3% perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.do_idle.cpu_startup_entry.start_secondary.common_startup_64
9.45 -6.3 3.13 ± 9% perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
9.21 -6.2 3.00 ± 8% perf-profile.calltrace.cycles-pp.pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.76 ± 2% -5.8 0.91 ± 3% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle.cpu_startup_entry.start_secondary
6.30 ± 2% -5.4 0.88 ± 3% perf-profile.calltrace.cycles-pp.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle.cpu_startup_entry
5.61 ± 2% -4.8 0.84 ± 3% perf-profile.calltrace.cycles-pp.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle
4.97 ± 2% -4.2 0.78 ± 3% perf-profile.calltrace.cycles-pp.enqueue_task.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue
4.84 ± 3% -4.1 0.77 ± 3% perf-profile.calltrace.cycles-pp.enqueue_task_fair.enqueue_task.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue
3.16 ± 4% -2.6 0.61 ± 5% perf-profile.calltrace.cycles-pp.dl_server_start.enqueue_task_fair.enqueue_task.ttwu_do_activate.sched_ttwu_pending
2.57 -1.4 1.13 ± 7% perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_entities.dequeue_task_fair.try_to_block_task.__schedule
3.45 -1.2 2.20 ± 8% perf-profile.calltrace.cycles-pp.core_sys_select.do_pselect.__x64_sys_pselect6.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.70 -1.2 1.50 ± 8% perf-profile.calltrace.cycles-pp.__poll.stress_run
2.45 -1.1 1.31 ± 8% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll.stress_run
2.43 -1.1 1.30 ± 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll.stress_run
2.77 -1.1 1.65 ± 8% perf-profile.calltrace.cycles-pp.do_select.core_sys_select.do_pselect.__x64_sys_pselect6.do_syscall_64
3.70 -1.1 2.58 ± 8% perf-profile.calltrace.cycles-pp.ppoll.stress_run
2.92 -1.1 1.82 ± 8% perf-profile.calltrace.cycles-pp.__select
2.30 -1.1 1.20 ± 8% perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll.stress_run
2.13 -1.1 1.08 ± 8% perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
2.67 -1.0 1.63 ± 8% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__select
2.65 -1.0 1.62 ± 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__select
2.54 -1.0 1.54 ± 8% perf-profile.calltrace.cycles-pp.__x64_sys_pselect6.do_syscall_64.entry_SYSCALL_64_after_hwframe.__select
2.51 -1.0 1.51 ± 9% perf-profile.calltrace.cycles-pp.do_pselect.__x64_sys_pselect6.do_syscall_64.entry_SYSCALL_64_after_hwframe.__select
2.91 -0.9 2.02 ± 8% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.ppoll.stress_run
2.86 -0.9 1.98 ± 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ppoll.stress_run
1.35 -0.9 0.49 ± 45% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.clock_nanosleep
2.53 -0.8 1.72 ± 8% perf-profile.calltrace.cycles-pp.__x64_sys_ppoll.do_syscall_64.entry_SYSCALL_64_after_hwframe.ppoll.stress_run
1.07 -0.7 0.38 ± 70% perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
2.01 -0.6 1.36 ± 28% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write.stress_run
2.57 -0.6 1.94 ± 9% perf-profile.calltrace.cycles-pp.pselect.stress_run
1.24 -0.6 0.64 ± 9% perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.32 -0.6 1.75 ± 9% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.pselect.stress_run
2.30 -0.6 1.74 ± 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.pselect.stress_run
2.18 -0.5 1.65 ± 9% perf-profile.calltrace.cycles-pp.__x64_sys_pselect6.do_syscall_64.entry_SYSCALL_64_after_hwframe.pselect.stress_run
2.10 -0.5 1.58 ± 9% perf-profile.calltrace.cycles-pp.do_pselect.__x64_sys_pselect6.do_syscall_64.entry_SYSCALL_64_after_hwframe.pselect
1.59 -0.5 1.08 ± 29% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read.stress_poll.stress_run
0.72 -0.5 0.26 ±100% perf-profile.calltrace.cycles-pp.__getrlimit.stress_run
1.46 -0.5 1.00 ± 9% perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.vfs_read.ksys_read.do_syscall_64
1.30 -0.4 0.86 ± 9% perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read.ksys_read
0.67 -0.4 0.26 ±100% perf-profile.calltrace.cycles-pp.__wake_up_sync_key.pipe_write.vfs_write.ksys_write.do_syscall_64
0.65 -0.4 0.26 ±100% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.stress_run
0.65 -0.4 0.26 ±100% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.read.stress_poll
0.77 -0.4 0.40 ± 71% perf-profile.calltrace.cycles-pp._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write.ksys_write
0.93 -0.3 0.58 ± 45% perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.vfs_write.ksys_write.do_syscall_64
0.70 -0.3 0.36 ± 70% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.setrlimit64.stress_run
0.62 ± 3% -0.3 0.29 ±100% perf-profile.calltrace.cycles-pp.pick_task_fair.pick_next_task_fair.__pick_next_task.__schedule.schedule
0.60 ± 4% -0.3 0.28 ±100% perf-profile.calltrace.cycles-pp.dequeue_entities.pick_task_fair.pick_next_task_fair.__pick_next_task.__schedule
1.09 -0.3 0.77 ± 8% perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.vfs_read.ksys_read.do_syscall_64
0.56 ± 4% -0.3 0.27 ±100% perf-profile.calltrace.cycles-pp.dl_server_stop.dequeue_entities.pick_task_fair.pick_next_task_fair.__pick_next_task
1.15 -0.3 0.87 ± 8% perf-profile.calltrace.cycles-pp.setrlimit64.stress_run
0.74 -0.3 0.47 ± 45% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.setrlimit64.stress_run
0.94 -0.3 0.68 ± 9% perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_ppoll.do_syscall_64.entry_SYSCALL_64_after_hwframe.ppoll
0.88 -0.2 0.65 ± 8% perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.vfs_read.ksys_read
0.95 -0.1 0.88 ± 5% perf-profile.calltrace.cycles-pp.queue_event.ordered_events__queue.process_simple.reader__read_event.perf_session__process_events
0.95 -0.1 0.88 ± 5% perf-profile.calltrace.cycles-pp.ordered_events__queue.process_simple.reader__read_event.perf_session__process_events.record__finish_output
0.96 -0.1 0.90 ± 5% perf-profile.calltrace.cycles-pp.process_simple.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
0.97 -0.1 0.91 ± 5% perf-profile.calltrace.cycles-pp.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
0.97 -0.1 0.91 ± 5% perf-profile.calltrace.cycles-pp.__cmd_record
0.97 -0.1 0.91 ± 5% perf-profile.calltrace.cycles-pp.perf_session__process_events.record__finish_output.__cmd_record
0.97 -0.1 0.91 ± 5% perf-profile.calltrace.cycles-pp.record__finish_output.__cmd_record
0.00 +0.7 0.65 ± 10% perf-profile.calltrace.cycles-pp.timerqueue_add.enqueue_hrtimer.__hrtimer_start_range_ns.hrtimer_start_range_ns.start_dl_timer
0.00 +0.7 0.67 ± 9% perf-profile.calltrace.cycles-pp.enqueue_hrtimer.__hrtimer_start_range_ns.hrtimer_start_range_ns.start_dl_timer.enqueue_dl_entity
0.00 +0.7 0.70 ± 11% perf-profile.calltrace.cycles-pp.update_sg_lb_stats.update_sd_lb_stats.sched_balance_find_src_group.sched_balance_rq.sched_balance_newidle
1.11 ± 8% +0.7 1.82 ± 5% perf-profile.calltrace.cycles-pp.__pick_next_task.__schedule.schedule.do_nanosleep.hrtimer_nanosleep
0.99 ± 9% +0.7 1.72 ± 6% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__pick_next_task.__schedule.schedule.do_nanosleep
0.00 +0.7 0.74 ± 11% perf-profile.calltrace.cycles-pp.update_sd_lb_stats.sched_balance_find_src_group.sched_balance_rq.sched_balance_newidle.pick_next_task_fair
0.00 +0.8 0.75 ± 11% perf-profile.calltrace.cycles-pp.sched_balance_find_src_group.sched_balance_rq.sched_balance_newidle.pick_next_task_fair.__pick_next_task
0.40 ± 71% +0.8 1.15 ± 9% perf-profile.calltrace.cycles-pp.sched_balance_newidle.pick_next_task_fair.__pick_next_task.__schedule.schedule
0.00 +1.0 0.95 ± 10% perf-profile.calltrace.cycles-pp.sched_balance_rq.sched_balance_newidle.pick_next_task_fair.__pick_next_task.__schedule
0.08 ±223% +1.0 1.05 ± 10% perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.enqueue_task.ttwu_do_activate.try_to_wake_up
0.75 ± 2% +3.1 3.88 ± 26% perf-profile.calltrace.cycles-pp.finish_task_switch.__schedule.schedule_idle.do_idle.cpu_startup_entry
0.09 ±223% +3.7 3.83 ± 27% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.finish_task_switch.__schedule.schedule_idle.do_idle
0.00 +3.8 3.77 ± 27% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.finish_task_switch
0.00 +3.8 3.78 ± 27% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.finish_task_switch.__schedule
0.00 +3.8 3.80 ± 27% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.finish_task_switch.__schedule.schedule_idle
39.66 +6.2 45.84 ± 2% perf-profile.calltrace.cycles-pp.common_startup_64
39.42 +6.2 45.61 ± 3% perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
39.39 +6.2 45.60 ± 3% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
39.32 +6.3 45.57 ± 3% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
3.96 ± 4% +8.8 12.76 ± 2% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__hrtimer_start_range_ns.hrtimer_start_range_ns.start_dl_timer
4.16 ± 4% +8.9 13.02 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock.__hrtimer_start_range_ns.hrtimer_start_range_ns.start_dl_timer.enqueue_dl_entity
5.00 ± 3% +9.2 14.23 perf-profile.calltrace.cycles-pp.__hrtimer_start_range_ns.hrtimer_start_range_ns.start_dl_timer.enqueue_dl_entity.dl_server_start
3.14 ± 8% +12.2 15.34 ± 16% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.hrtimer_start_range_ns.start_dl_timer.enqueue_dl_entity
3.33 ± 8% +12.4 15.73 ± 15% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.hrtimer_start_range_ns.start_dl_timer.enqueue_dl_entity.dl_server_start
26.25 +12.6 38.87 ± 2% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
24.91 +13.0 37.86 ± 3% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
24.29 +13.0 37.33 ± 3% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
8.54 +14.8 23.30 ± 4% perf-profile.calltrace.cycles-pp.dequeue_entities.dequeue_task_fair.try_to_block_task.__schedule.schedule
13.88 +15.3 29.13 ± 2% perf-profile.calltrace.cycles-pp.clock_nanosleep
12.88 +15.6 28.47 ± 3% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.clock_nanosleep
12.90 +15.6 28.49 ± 3% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.clock_nanosleep
4.28 ± 10% +16.2 20.46 ± 7% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.hrtimer_try_to_cancel.dl_server_stop.dequeue_entities
5.70 ± 3% +16.3 22.00 ± 5% perf-profile.calltrace.cycles-pp.dl_server_stop.dequeue_entities.dequeue_task_fair.try_to_block_task.__schedule
4.20 ± 4% +16.4 20.62 ± 6% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.hrtimer_try_to_cancel.dl_server_stop.dequeue_entities.dequeue_task_fair
11.34 +16.5 27.80 ± 3% perf-profile.calltrace.cycles-pp.__x64_sys_clock_nanosleep.do_syscall_64.entry_SYSCALL_64_after_hwframe.clock_nanosleep
11.04 +16.5 27.58 ± 3% perf-profile.calltrace.cycles-pp.common_nsleep.__x64_sys_clock_nanosleep.do_syscall_64.entry_SYSCALL_64_after_hwframe.clock_nanosleep
11.03 +16.5 27.57 ± 3% perf-profile.calltrace.cycles-pp.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep.do_syscall_64.entry_SYSCALL_64_after_hwframe
10.88 +16.6 27.46 ± 3% perf-profile.calltrace.cycles-pp.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep.do_syscall_64
4.56 ± 4% +16.7 21.23 ± 6% perf-profile.calltrace.cycles-pp.hrtimer_try_to_cancel.dl_server_stop.dequeue_entities.dequeue_task_fair.try_to_block_task
6.60 ± 2% +16.8 23.41 ± 4% perf-profile.calltrace.cycles-pp.try_to_block_task.__schedule.schedule.do_nanosleep.hrtimer_nanosleep
6.56 ± 2% +16.8 23.38 ± 4% perf-profile.calltrace.cycles-pp.dequeue_task_fair.try_to_block_task.__schedule.schedule.do_nanosleep
9.56 +16.9 26.44 ± 4% perf-profile.calltrace.cycles-pp.schedule.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
9.47 +16.9 26.38 ± 4% perf-profile.calltrace.cycles-pp.__schedule.schedule.do_nanosleep.hrtimer_nanosleep.common_nsleep
10.92 +20.9 31.83 ± 4% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
10.03 ± 2% +21.0 31.00 ± 5% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
9.54 ± 2% +21.0 30.58 ± 5% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
9.38 ± 2% +21.0 30.43 ± 5% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
9.54 ± 4% +21.4 30.97 ± 8% perf-profile.calltrace.cycles-pp.enqueue_dl_entity.dl_server_start.enqueue_task_fair.enqueue_task.ttwu_do_activate
9.13 ± 4% +21.6 30.74 ± 8% perf-profile.calltrace.cycles-pp.start_dl_timer.enqueue_dl_entity.dl_server_start.enqueue_task_fair.enqueue_task
8.84 ± 4% +21.7 30.56 ± 8% perf-profile.calltrace.cycles-pp.hrtimer_start_range_ns.start_dl_timer.enqueue_dl_entity.dl_server_start.enqueue_task_fair
6.41 ± 3% +24.0 30.41 ± 8% perf-profile.calltrace.cycles-pp.dl_server_start.enqueue_task_fair.enqueue_task.ttwu_do_activate.try_to_wake_up
8.60 ± 2% +24.8 33.35 ± 7% perf-profile.calltrace.cycles-pp.hrtimer_wakeup.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
8.58 ± 2% +24.8 33.33 ± 7% perf-profile.calltrace.cycles-pp.try_to_wake_up.hrtimer_wakeup.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
7.13 ± 3% +24.8 31.91 ± 7% perf-profile.calltrace.cycles-pp.enqueue_task_fair.enqueue_task.ttwu_do_activate.try_to_wake_up.hrtimer_wakeup
7.15 ± 3% +24.8 31.94 ± 7% perf-profile.calltrace.cycles-pp.enqueue_task.ttwu_do_activate.try_to_wake_up.hrtimer_wakeup.__hrtimer_run_queues
7.19 ± 3% +24.8 31.99 ± 7% perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.hrtimer_wakeup.__hrtimer_run_queues.hrtimer_interrupt
8.91 ± 2% +24.8 33.76 ± 7% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
40.62 -19.6 21.06 ± 10% perf-profile.children.cycles-pp.stress_run
16.04 -8.1 7.94 ± 9% perf-profile.children.cycles-pp.stress_poll
13.84 -7.6 6.21 ± 8% perf-profile.children.cycles-pp.write
14.59 -7.6 7.03 ± 9% perf-profile.children.cycles-pp.read
12.53 -7.5 4.99 ± 10% perf-profile.children.cycles-pp.intel_idle
10.91 -6.7 4.20 ± 9% perf-profile.children.cycles-pp.ksys_read
10.44 -6.6 3.85 ± 9% perf-profile.children.cycles-pp.vfs_read
10.41 -6.5 3.90 ± 8% perf-profile.children.cycles-pp.ksys_write
10.00 -6.5 3.53 ± 9% perf-profile.children.cycles-pp.pipe_read
7.51 -6.4 1.07 ± 3% perf-profile.children.cycles-pp.__flush_smp_call_function_queue
7.42 -6.4 0.98 ± 4% perf-profile.children.cycles-pp.flush_smp_call_function_queue
10.00 -6.4 3.59 ± 8% perf-profile.children.cycles-pp.vfs_write
9.34 -6.3 3.08 ± 8% perf-profile.children.cycles-pp.pipe_write
6.96 ± 2% -5.9 1.03 ± 3% perf-profile.children.cycles-pp.sched_ttwu_pending
5.18 -4.7 0.53 ± 8% perf-profile.children.cycles-pp.__wake_up_sync_key
4.50 -4.4 0.12 ± 10% perf-profile.children.cycles-pp.__wake_up_common
48.46 -2.3 46.12 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
2.58 -2.3 0.30 ± 6% perf-profile.children.cycles-pp.select_task_rq
47.90 -2.2 45.72 perf-profile.children.cycles-pp.do_syscall_64
2.35 -2.1 0.27 ± 6% perf-profile.children.cycles-pp.select_task_rq_fair
2.03 -1.8 0.22 ± 5% perf-profile.children.cycles-pp.select_idle_sibling
4.74 -1.5 3.19 ± 8% perf-profile.children.cycles-pp.__x64_sys_pselect6
4.63 -1.5 3.10 ± 8% perf-profile.children.cycles-pp.do_pselect
2.68 -1.5 1.17 ± 7% perf-profile.children.cycles-pp.dequeue_entity
3.11 -1.3 1.80 ± 8% perf-profile.children.cycles-pp.do_sys_poll
3.35 -1.3 2.05 ± 8% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
2.34 -1.3 1.07 ± 9% perf-profile.children.cycles-pp.pipe_poll
3.48 -1.3 2.22 ± 9% perf-profile.children.cycles-pp.core_sys_select
2.82 -1.2 1.60 ± 8% perf-profile.children.cycles-pp.__poll
4.00 -1.2 2.82 ± 8% perf-profile.children.cycles-pp.ppoll
3.05 -1.1 1.92 ± 8% perf-profile.children.cycles-pp.__select
2.82 -1.1 1.69 ± 8% perf-profile.children.cycles-pp.do_select
2.32 -1.1 1.22 ± 8% perf-profile.children.cycles-pp.__x64_sys_poll
1.17 -1.0 0.15 ± 6% perf-profile.children.cycles-pp.available_idle_cpu
1.85 -1.0 0.84 ± 7% perf-profile.children.cycles-pp.update_load_avg
1.05 -1.0 0.08 ± 10% perf-profile.children.cycles-pp.ttwu_queue_wakelist
1.12 -0.8 0.27 ± 8% perf-profile.children.cycles-pp.switch_mm_irqs_off
3.48 -0.8 2.67 ± 8% perf-profile.children.cycles-pp.entry_SYSCALL_64
3.11 ± 3% -0.8 2.30 ± 4% perf-profile.children.cycles-pp.__pick_next_task
2.55 -0.8 1.74 ± 8% perf-profile.children.cycles-pp.__x64_sys_ppoll
2.07 -0.8 1.27 ± 8% perf-profile.children.cycles-pp.enqueue_entity
1.85 -0.8 1.07 ± 8% perf-profile.children.cycles-pp.do_poll
0.77 -0.7 0.03 ± 70% perf-profile.children.cycles-pp.__smp_call_single_queue
1.75 -0.7 1.03 ± 9% perf-profile.children.cycles-pp.mutex_lock
2.84 ± 3% -0.7 2.14 ± 4% perf-profile.children.cycles-pp.pick_next_task_fair
0.88 -0.7 0.19 ± 3% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
2.67 -0.7 2.02 ± 8% perf-profile.children.cycles-pp.pselect
0.92 -0.6 0.31 ± 7% perf-profile.children.cycles-pp.prepare_task_switch
0.89 -0.6 0.28 ± 9% perf-profile.children.cycles-pp.__switch_to
0.68 -0.6 0.07 ± 8% perf-profile.children.cycles-pp.set_task_cpu
0.77 -0.6 0.17 ± 4% perf-profile.children.cycles-pp.sysvec_call_function_single
0.65 -0.6 0.07 ± 9% perf-profile.children.cycles-pp.sched_mm_cid_migrate_to
0.72 -0.6 0.16 ± 3% perf-profile.children.cycles-pp.__sysvec_call_function_single
1.13 ± 3% -0.6 0.57 ± 9% perf-profile.children.cycles-pp.pick_task_fair
0.86 ± 3% -0.5 0.32 ± 8% perf-profile.children.cycles-pp.update_cfs_group
0.76 -0.5 0.22 ± 6% perf-profile.children.cycles-pp.__switch_to_asm
1.57 -0.5 1.05 ± 8% perf-profile.children.cycles-pp._copy_from_user
1.28 -0.5 0.77 ± 8% perf-profile.children.cycles-pp.read_tsc
0.80 -0.5 0.32 ± 9% perf-profile.children.cycles-pp.__pollwait
1.50 -0.5 1.03 ± 9% perf-profile.children.cycles-pp.copy_page_to_iter
1.00 -0.5 0.52 ± 6% perf-profile.children.cycles-pp.update_curr
0.98 -0.5 0.51 ± 10% perf-profile.children.cycles-pp.ktime_get
1.31 -0.4 0.87 ± 9% perf-profile.children.cycles-pp._copy_to_iter
0.76 -0.4 0.34 ± 8% perf-profile.children.cycles-pp.update_rq_clock
0.64 -0.4 0.23 ± 10% perf-profile.children.cycles-pp.native_sched_clock
0.56 -0.4 0.16 ± 7% perf-profile.children.cycles-pp.switch_fpu_return
0.76 -0.4 0.37 ± 9% perf-profile.children.cycles-pp.add_wait_queue
0.64 -0.4 0.25 ± 8% perf-profile.children.cycles-pp.sched_clock_cpu
1.56 -0.3 1.22 ± 8% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
0.56 -0.3 0.22 ± 10% perf-profile.children.cycles-pp.sched_clock
0.47 -0.3 0.13 ± 8% perf-profile.children.cycles-pp.do_perf_trace_sched_wakeup_template
0.83 -0.3 0.49 ± 8% perf-profile.children.cycles-pp.set_user_sigmask
1.34 -0.3 1.02 ± 8% perf-profile.children.cycles-pp.setrlimit64
1.12 -0.3 0.80 ± 8% perf-profile.children.cycles-pp.touch_atime
0.40 -0.3 0.10 ± 8% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
0.86 -0.3 0.56 ± 8% perf-profile.children.cycles-pp.__do_sys_prlimit64
0.59 -0.3 0.31 ± 4% perf-profile.children.cycles-pp.set_next_task_fair
0.67 -0.3 0.40 ± 8% perf-profile.children.cycles-pp.fdget
0.56 -0.3 0.29 ± 4% perf-profile.children.cycles-pp.set_next_entity
0.45 -0.3 0.18 ± 8% perf-profile.children.cycles-pp.__update_load_avg_se
0.82 -0.3 0.55 ± 8% perf-profile.children.cycles-pp.__getrlimit
0.94 -0.2 0.70 ± 8% perf-profile.children.cycles-pp.atime_needs_update
0.54 -0.2 0.29 ± 9% perf-profile.children.cycles-pp.hrtimer_active
0.91 -0.2 0.66 ± 8% perf-profile.children.cycles-pp.ktime_get_ts64
0.95 -0.2 0.72 ± 8% perf-profile.children.cycles-pp.copy_page_from_iter
0.93 -0.2 0.70 ± 9% perf-profile.children.cycles-pp.poll_select_finish
0.29 -0.2 0.07 ± 10% perf-profile.children.cycles-pp.perf_tp_event
0.34 -0.2 0.13 ± 7% perf-profile.children.cycles-pp.tick_nohz_idle_enter
0.47 -0.2 0.26 ± 9% perf-profile.children.cycles-pp.__rseq_handle_notify_resume
0.64 ± 2% -0.2 0.44 ± 11% perf-profile.children.cycles-pp.get_nohz_timer_target
0.78 -0.2 0.58 ± 8% perf-profile.children.cycles-pp._copy_from_iter
0.28 ± 2% -0.2 0.09 ± 7% perf-profile.children.cycles-pp.___perf_sw_event
0.36 -0.2 0.18 ± 9% perf-profile.children.cycles-pp.rseq_ip_fixup
0.39 ± 3% -0.2 0.20 ± 11% perf-profile.children.cycles-pp.task_contending
0.67 -0.2 0.49 ± 8% perf-profile.children.cycles-pp.get_timespec64
0.54 -0.2 0.37 ± 8% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.63 -0.2 0.46 ± 9% perf-profile.children.cycles-pp.mutex_unlock
0.26 ± 2% -0.2 0.10 ± 11% perf-profile.children.cycles-pp.__get_user_8
0.27 ± 2% -0.2 0.11 ± 10% perf-profile.children.cycles-pp.rseq_get_rseq_cs
0.31 ± 2% -0.2 0.15 ± 8% perf-profile.children.cycles-pp.__nanosleep
0.76 -0.2 0.60 ± 8% perf-profile.children.cycles-pp.current_time
0.58 ± 3% -0.2 0.43 ± 8% perf-profile.children.cycles-pp.idle_cpu
0.25 -0.2 0.10 ± 11% perf-profile.children.cycles-pp.__wrgsbase_inactive
0.32 -0.1 0.17 ± 7% perf-profile.children.cycles-pp.do_prlimit
0.48 -0.1 0.34 ± 9% perf-profile.children.cycles-pp.fdget_pos
0.59 -0.1 0.44 ± 8% perf-profile.children.cycles-pp.file_update_time
0.64 -0.1 0.50 ± 8% perf-profile.children.cycles-pp.poll_freewait
0.54 -0.1 0.40 ± 8% perf-profile.children.cycles-pp.select_estimate_accuracy
0.24 -0.1 0.11 ± 6% perf-profile.children.cycles-pp.sleep
0.35 -0.1 0.22 ± 8% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.49 -0.1 0.36 ± 9% perf-profile.children.cycles-pp.rw_verify_area
0.22 ± 2% -0.1 0.09 ± 7% perf-profile.children.cycles-pp.reweight_entity
0.87 ± 2% -0.1 0.75 ± 7% perf-profile.children.cycles-pp.raw_spin_rq_lock_nested
0.48 -0.1 0.36 ± 9% perf-profile.children.cycles-pp.inode_needs_update_time
0.17 -0.1 0.06 ± 11% perf-profile.children.cycles-pp.local_clock_noinstr
0.32 -0.1 0.21 ± 6% perf-profile.children.cycles-pp.__irq_exit_rcu
0.67 -0.1 0.57 ± 9% perf-profile.children.cycles-pp.clockevents_program_event
0.25 -0.1 0.14 ± 8% perf-profile.children.cycles-pp.update_rq_clock_task
0.13 ± 3% -0.1 0.03 ± 70% perf-profile.children.cycles-pp.__rdgsbase_inactive
0.18 ± 2% -0.1 0.08 ± 11% perf-profile.children.cycles-pp.update_entity_lag
0.38 -0.1 0.29 ± 9% perf-profile.children.cycles-pp.remove_wait_queue
0.48 -0.1 0.39 ± 7% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.29 ± 3% -0.1 0.20 ± 7% perf-profile.children.cycles-pp.tick_nohz_handler
0.12 -0.1 0.03 ± 70% perf-profile.children.cycles-pp.ct_kernel_enter
0.40 -0.1 0.32 ± 8% perf-profile.children.cycles-pp.native_irq_return_iret
0.14 -0.1 0.06 ± 8% perf-profile.children.cycles-pp.ct_idle_exit
0.28 -0.1 0.20 ± 8% perf-profile.children.cycles-pp.__set_current_blocked
0.44 ± 2% -0.1 0.36 ± 8% perf-profile.children.cycles-pp.x64_sys_call
0.12 ± 3% -0.1 0.04 ± 45% perf-profile.children.cycles-pp.__calc_delta
0.26 ± 2% -0.1 0.18 ± 7% perf-profile.children.cycles-pp.update_process_times
0.32 -0.1 0.24 ± 8% perf-profile.children.cycles-pp._copy_to_user
0.32 -0.1 0.25 ± 9% perf-profile.children.cycles-pp.__cond_resched
0.32 -0.1 0.25 ± 8% perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64_mg
0.13 ± 3% -0.1 0.06 ± 6% perf-profile.children.cycles-pp.__dequeue_entity
0.12 ± 3% -0.1 0.05 ± 45% perf-profile.children.cycles-pp.call_cpuidle
0.28 -0.1 0.22 ± 9% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.28 -0.1 0.21 ± 7% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.18 ± 2% -0.1 0.12 ± 4% perf-profile.children.cycles-pp.handle_softirqs
0.95 -0.1 0.88 ± 5% perf-profile.children.cycles-pp.queue_event
0.12 -0.1 0.06 ± 8% perf-profile.children.cycles-pp.update_min_vruntime
0.95 -0.1 0.88 ± 5% perf-profile.children.cycles-pp.ordered_events__queue
0.30 -0.1 0.23 ± 9% perf-profile.children.cycles-pp.put_timespec64
0.20 ± 2% -0.1 0.13 ± 5% perf-profile.children.cycles-pp.place_entity
0.96 -0.1 0.90 ± 5% perf-profile.children.cycles-pp.process_simple
0.10 ± 5% -0.1 0.04 ± 71% perf-profile.children.cycles-pp.tick_nohz_stop_idle
0.97 -0.1 0.91 ± 5% perf-profile.children.cycles-pp.perf_session__process_events
0.97 -0.1 0.91 ± 5% perf-profile.children.cycles-pp.reader__read_event
0.97 -0.1 0.91 ± 5% perf-profile.children.cycles-pp.record__finish_output
0.47 -0.1 0.42 ± 9% perf-profile.children.cycles-pp.lapic_next_deadline
0.19 ± 2% -0.0 0.14 ± 10% perf-profile.children.cycles-pp.recalc_sigpending
0.20 -0.0 0.15 ± 9% perf-profile.children.cycles-pp.native_apic_msr_eoi
0.18 ± 4% -0.0 0.14 ± 8% perf-profile.children.cycles-pp.__enqueue_entity
0.14 ± 2% -0.0 0.10 ± 8% perf-profile.children.cycles-pp.poll_select_set_timeout
0.10 -0.0 0.06 ± 13% perf-profile.children.cycles-pp.task_non_contending
0.15 -0.0 0.11 ± 10% perf-profile.children.cycles-pp.os_xsave
0.10 ± 3% -0.0 0.06 ± 11% perf-profile.children.cycles-pp.__mutex_lock
0.16 ± 2% -0.0 0.12 ± 9% perf-profile.children.cycles-pp.fput
0.19 -0.0 0.15 ± 9% perf-profile.children.cycles-pp.security_file_permission
0.15 -0.0 0.12 ± 8% perf-profile.children.cycles-pp.irqtime_account_irq
0.17 ± 2% -0.0 0.14 ± 8% perf-profile.children.cycles-pp.rcu_all_qs
0.12 ± 3% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.vruntime_eligible
0.14 -0.0 0.11 ± 12% perf-profile.children.cycles-pp.__check_object_size
0.06 ± 7% -0.0 0.03 ± 70% perf-profile.children.cycles-pp.__put_user_8
0.11 -0.0 0.08 ± 10% perf-profile.children.cycles-pp.__fdelt_warn
0.14 ± 3% -0.0 0.11 ± 9% perf-profile.children.cycles-pp.__memset
0.12 ± 4% -0.0 0.09 ± 10% perf-profile.children.cycles-pp.kill_fasync
0.14 -0.0 0.11 ± 11% perf-profile.children.cycles-pp.avg_vruntime
0.09 -0.0 0.06 ± 11% perf-profile.children.cycles-pp.task_mm_cid_work
0.06 -0.0 0.03 ± 70% perf-profile.children.cycles-pp.make_vfsgid
0.09 ± 4% -0.0 0.07 ± 7% perf-profile.children.cycles-pp.task_work_run
0.08 -0.0 0.06 ± 8% perf-profile.children.cycles-pp.__fdelt_chk@plt
0.10 ± 5% -0.0 0.07 ± 10% perf-profile.children.cycles-pp.rseq_update_cpu_node_id
0.07 -0.0 0.05 perf-profile.children.cycles-pp.put_prev_entity
0.07 -0.0 0.05 ± 7% perf-profile.children.cycles-pp.update_irq_load_avg
0.07 -0.0 0.06 ± 9% perf-profile.children.cycles-pp.get_sigset_argpack
0.08 ± 4% -0.0 0.07 ± 7% perf-profile.children.cycles-pp.put_prev_task_fair
0.07 -0.0 0.06 ± 8% perf-profile.children.cycles-pp.sched_balance_domains
0.07 ± 9% +0.0 0.10 ± 6% perf-profile.children.cycles-pp._find_next_and_bit
0.06 ± 6% +0.0 0.09 ± 30% perf-profile.children.cycles-pp.handle_internal_command
0.06 ± 6% +0.0 0.09 ± 30% perf-profile.children.cycles-pp.main
0.06 ± 6% +0.0 0.09 ± 30% perf-profile.children.cycles-pp.run_builtin
0.05 +0.0 0.08 ± 31% perf-profile.children.cycles-pp.cmd_record
0.00 +0.1 0.06 ± 31% perf-profile.children.cycles-pp.perf_mmap__push
0.00 +0.1 0.07 ± 36% perf-profile.children.cycles-pp.record__mmap_read_evlist
0.48 +0.1 0.57 ± 8% perf-profile.children.cycles-pp.timerqueue_del
0.12 ± 3% +0.1 0.22 ± 19% perf-profile.children.cycles-pp.hrtimer_next_event_without
0.00 +0.1 0.11 ± 8% perf-profile.children.cycles-pp.sched_balance_find_src_rq
0.41 +0.1 0.52 ± 8% perf-profile.children.cycles-pp.rb_erase
0.08 ± 6% +0.1 0.21 ± 6% perf-profile.children.cycles-pp.update_curr_dl_se
0.24 ± 4% +0.2 0.43 ± 19% perf-profile.children.cycles-pp.__get_next_timer_interrupt
0.30 ± 3% +0.2 0.48 ± 18% perf-profile.children.cycles-pp.tick_nohz_next_event
0.12 ± 7% +0.2 0.32 ± 22% perf-profile.children.cycles-pp.hrtimer_get_next_event
0.45 ± 2% +0.3 0.74 ± 18% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.15 +0.3 0.46 ± 22% perf-profile.children.cycles-pp.poll_idle
0.80 ± 16% +0.4 1.17 ± 9% perf-profile.children.cycles-pp.sched_balance_newidle
0.38 ± 28% +0.4 0.77 ± 11% perf-profile.children.cycles-pp.update_sd_lb_stats
0.36 ± 30% +0.4 0.74 ± 11% perf-profile.children.cycles-pp.update_sg_lb_stats
0.40 ± 27% +0.4 0.79 ± 11% perf-profile.children.cycles-pp.sched_balance_find_src_group
0.47 ± 24% +0.5 1.00 ± 9% perf-profile.children.cycles-pp.sched_balance_rq
1.51 +2.6 4.10 ± 25% perf-profile.children.cycles-pp.finish_task_switch
39.66 +6.2 45.84 ± 2% perf-profile.children.cycles-pp.common_startup_64
39.66 +6.2 45.84 ± 2% perf-profile.children.cycles-pp.cpu_startup_entry
39.42 +6.2 45.61 ± 3% perf-profile.children.cycles-pp.start_secondary
39.60 +6.2 45.82 ± 2% perf-profile.children.cycles-pp.do_idle
5.60 ± 2% +8.7 14.34 perf-profile.children.cycles-pp._raw_spin_lock
6.49 ± 2% +8.8 15.24 perf-profile.children.cycles-pp.__hrtimer_start_range_ns
13.94 +12.5 26.47 ± 4% perf-profile.children.cycles-pp.schedule
26.42 +12.7 39.07 ± 3% perf-profile.children.cycles-pp.cpuidle_idle_call
25.01 +13.0 37.98 ± 3% perf-profile.children.cycles-pp.cpuidle_enter
24.98 +13.0 37.97 ± 3% perf-profile.children.cycles-pp.cpuidle_enter_state
18.29 +13.4 31.65 ± 6% perf-profile.children.cycles-pp.__schedule
9.51 ± 2% +14.3 23.82 ± 5% perf-profile.children.cycles-pp.dequeue_entities
8.69 +14.7 23.42 ± 4% perf-profile.children.cycles-pp.try_to_block_task
8.63 +14.8 23.40 ± 4% perf-profile.children.cycles-pp.dequeue_task_fair
14.16 +15.1 29.27 ± 2% perf-profile.children.cycles-pp.clock_nanosleep
6.58 ± 3% +15.9 22.49 ± 5% perf-profile.children.cycles-pp.dl_server_stop
6.11 ± 3% +16.0 22.10 ± 6% perf-profile.children.cycles-pp.hrtimer_try_to_cancel
11.36 +16.4 27.80 ± 3% perf-profile.children.cycles-pp.__x64_sys_clock_nanosleep
11.13 +16.5 27.63 ± 3% perf-profile.children.cycles-pp.common_nsleep
11.04 +16.5 27.58 ± 3% perf-profile.children.cycles-pp.hrtimer_nanosleep
10.90 +16.6 27.48 ± 3% perf-profile.children.cycles-pp.do_nanosleep
13.85 ± 2% +19.6 33.48 ± 7% perf-profile.children.cycles-pp.ttwu_do_activate
13.09 ± 2% +20.3 33.39 ± 7% perf-profile.children.cycles-pp.enqueue_task
13.56 +20.4 33.96 ± 7% perf-profile.children.cycles-pp.try_to_wake_up
12.92 ± 3% +20.4 33.35 ± 7% perf-profile.children.cycles-pp.enqueue_task_fair
10.20 ± 4% +21.4 31.56 ± 8% perf-profile.children.cycles-pp.enqueue_dl_entity
10.24 ± 4% +21.4 31.61 ± 8% perf-profile.children.cycles-pp.dl_server_start
11.02 ± 3% +21.5 32.51 ± 7% perf-profile.children.cycles-pp.hrtimer_start_range_ns
9.74 ± 4% +21.6 31.34 ± 8% perf-profile.children.cycles-pp.start_dl_timer
12.42 +24.0 36.41 ± 6% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
11.69 +24.1 35.78 ± 6% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
11.11 ± 2% +24.2 35.30 ± 6% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
10.91 ± 2% +24.2 35.13 ± 6% perf-profile.children.cycles-pp.hrtimer_interrupt
10.24 ± 2% +24.2 34.49 ± 6% perf-profile.children.cycles-pp.__hrtimer_run_queues
9.69 ± 2% +24.3 33.96 ± 7% perf-profile.children.cycles-pp.hrtimer_wakeup
11.90 ± 4% +27.9 39.77 ± 9% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
13.14 ± 4% +38.1 51.29 ± 8% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
12.52 -7.5 4.99 ± 10% perf-profile.self.cycles-pp.intel_idle
3.10 -1.4 1.72 ± 9% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
1.16 -1.0 0.15 ± 6% perf-profile.self.cycles-pp.available_idle_cpu
1.65 -0.9 0.74 ± 5% perf-profile.self.cycles-pp.__schedule
1.07 -0.9 0.20 ± 8% perf-profile.self.cycles-pp.switch_mm_irqs_off
1.44 -0.6 0.80 ± 9% perf-profile.self.cycles-pp.mutex_lock
0.87 -0.6 0.27 ± 9% perf-profile.self.cycles-pp.__switch_to
0.88 -0.6 0.28 ± 8% perf-profile.self.cycles-pp.update_load_avg
1.94 -0.6 1.36 ± 8% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.64 -0.6 0.07 ± 9% perf-profile.self.cycles-pp.sched_mm_cid_migrate_to
0.85 ± 3% -0.5 0.31 ± 8% perf-profile.self.cycles-pp.update_cfs_group
0.75 -0.5 0.22 ± 6% perf-profile.self.cycles-pp.__switch_to_asm
1.51 -0.5 1.02 ± 8% perf-profile.self.cycles-pp._copy_from_user
0.72 -0.5 0.23 ± 6% perf-profile.self.cycles-pp.prepare_task_switch
0.58 -0.5 0.09 ± 7% perf-profile.self.cycles-pp.__wake_up_common
1.22 -0.5 0.74 ± 8% perf-profile.self.cycles-pp.read_tsc
0.77 -0.5 0.31 ± 9% perf-profile.self.cycles-pp.__pollwait
1.25 -0.5 0.79 ± 9% perf-profile.self.cycles-pp.stress_poll
1.29 ± 2% -0.5 0.83 ± 8% perf-profile.self.cycles-pp.pipe_read
1.26 -0.4 0.84 ± 9% perf-profile.self.cycles-pp._copy_to_iter
0.59 -0.4 0.18 ± 4% perf-profile.self.cycles-pp.finish_task_switch
0.62 -0.4 0.22 ± 9% perf-profile.self.cycles-pp.native_sched_clock
0.71 -0.4 0.34 ± 9% perf-profile.self.cycles-pp.pipe_poll
0.43 -0.4 0.08 ± 6% perf-profile.self.cycles-pp.try_to_wake_up
0.92 -0.4 0.57 ± 9% perf-profile.self.cycles-pp.pipe_write
1.53 -0.3 1.19 ± 8% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
0.40 -0.3 0.10 ± 8% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
0.42 -0.3 0.14 ± 8% perf-profile.self.cycles-pp.menu_select
0.56 -0.3 0.29 ± 9% perf-profile.self.cycles-pp.do_sys_poll
0.63 -0.3 0.37 ± 9% perf-profile.self.cycles-pp.fdget
1.08 -0.2 0.83 ± 9% perf-profile.self.cycles-pp.read
0.38 -0.2 0.13 ± 7% perf-profile.self.cycles-pp.dequeue_entity
0.42 -0.2 0.17 ± 8% perf-profile.self.cycles-pp.__update_load_avg_se
0.92 -0.2 0.67 ± 8% perf-profile.self.cycles-pp.do_syscall_64
0.99 -0.2 0.74 ± 8% perf-profile.self.cycles-pp.write
0.28 -0.2 0.04 ± 45% perf-profile.self.cycles-pp.set_task_cpu
0.51 -0.2 0.28 ± 10% perf-profile.self.cycles-pp.hrtimer_active
0.46 -0.2 0.23 ± 7% perf-profile.self.cycles-pp.update_rq_clock
0.76 -0.2 0.57 ± 9% perf-profile.self.cycles-pp._copy_from_iter
0.32 ± 2% -0.2 0.13 ± 8% perf-profile.self.cycles-pp.update_curr
0.70 -0.2 0.52 ± 8% perf-profile.self.cycles-pp.entry_SYSCALL_64
0.28 -0.2 0.11 ± 8% perf-profile.self.cycles-pp.do_idle
0.34 -0.2 0.17 ± 7% perf-profile.self.cycles-pp.clock_nanosleep
0.60 -0.2 0.44 ± 9% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.91 -0.2 0.74 ± 9% perf-profile.self.cycles-pp.cpuidle_enter_state
0.22 ± 3% -0.2 0.06 ± 7% perf-profile.self.cycles-pp.pick_next_task_fair
0.60 -0.2 0.44 ± 9% perf-profile.self.cycles-pp.mutex_unlock
0.57 ± 2% -0.2 0.41 ± 9% perf-profile.self.cycles-pp.idle_cpu
0.25 ± 2% -0.2 0.10 ± 9% perf-profile.self.cycles-pp.__get_user_8
0.45 ± 2% -0.2 0.30 ± 9% perf-profile.self.cycles-pp.atime_needs_update
0.37 ± 2% -0.1 0.23 ± 11% perf-profile.self.cycles-pp.ktime_get
0.58 -0.1 0.43 ± 9% perf-profile.self.cycles-pp.vfs_read
0.24 -0.1 0.09 ± 7% perf-profile.self.cycles-pp.__wrgsbase_inactive
0.44 ± 2% -0.1 0.30 ± 9% perf-profile.self.cycles-pp.fdget_pos
0.46 -0.1 0.34 ± 9% perf-profile.self.cycles-pp.ppoll
0.21 -0.1 0.08 ± 10% perf-profile.self.cycles-pp.cpuidle_idle_call
0.23 ± 2% -0.1 0.10 ± 10% perf-profile.self.cycles-pp.sched_balance_newidle
0.19 ± 2% -0.1 0.06 ± 9% perf-profile.self.cycles-pp.___perf_sw_event
0.18 ± 2% -0.1 0.06 ± 9% perf-profile.self.cycles-pp.select_task_rq_fair
0.46 -0.1 0.33 ± 8% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.21 -0.1 0.09 ± 12% perf-profile.self.cycles-pp.sleep
0.57 -0.1 0.46 ± 9% perf-profile.self.cycles-pp.do_select
0.46 -0.1 0.34 ± 8% perf-profile.self.cycles-pp.vfs_write
0.42 -0.1 0.31 ± 6% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.16 ± 2% -0.1 0.06 ± 9% perf-profile.self.cycles-pp.switch_fpu_return
0.29 -0.1 0.18 ± 9% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.19 -0.1 0.08 ± 8% perf-profile.self.cycles-pp.reweight_entity
0.20 -0.1 0.10 ± 13% perf-profile.self.cycles-pp.hrtimer_start_range_ns
0.13 ± 3% -0.1 0.03 ± 70% perf-profile.self.cycles-pp.__rdgsbase_inactive
0.29 -0.1 0.20 ± 10% perf-profile.self.cycles-pp.rw_verify_area
0.11 -0.1 0.02 ± 99% perf-profile.self.cycles-pp.__calc_delta
0.44 -0.1 0.35 ± 8% perf-profile.self.cycles-pp.current_time
0.40 -0.1 0.32 ± 8% perf-profile.self.cycles-pp.native_irq_return_iret
0.18 ± 2% -0.1 0.09 ± 9% perf-profile.self.cycles-pp.update_rq_clock_task
0.11 ± 4% -0.1 0.02 ± 99% perf-profile.self.cycles-pp.ttwu_do_activate
0.28 -0.1 0.20 ± 8% perf-profile.self.cycles-pp.__do_sys_prlimit64
0.10 ± 3% -0.1 0.02 ± 99% perf-profile.self.cycles-pp.__dequeue_entity
0.30 -0.1 0.23 ± 5% perf-profile.self.cycles-pp.enqueue_task_fair
0.31 -0.1 0.24 ± 9% perf-profile.self.cycles-pp._copy_to_user
0.39 -0.1 0.32 ± 9% perf-profile.self.cycles-pp.x64_sys_call
0.11 -0.1 0.04 ± 44% perf-profile.self.cycles-pp.update_min_vruntime
0.28 -0.1 0.21 ± 9% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.29 ± 2% -0.1 0.22 ± 8% perf-profile.self.cycles-pp.ktime_get_ts64
0.28 -0.1 0.22 ± 9% perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64_mg
0.23 -0.1 0.17 ± 7% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.10 ± 3% -0.1 0.04 ± 45% perf-profile.self.cycles-pp.place_entity
0.13 ± 3% -0.1 0.07 ± 10% perf-profile.self.cycles-pp.__wake_up_sync_key
0.20 ± 2% -0.1 0.14 ± 9% perf-profile.self.cycles-pp.select_estimate_accuracy
0.47 -0.1 0.42 ± 9% perf-profile.self.cycles-pp.lapic_next_deadline
0.30 -0.1 0.25 ± 11% perf-profile.self.cycles-pp.core_sys_select
0.21 ± 3% -0.1 0.16 ± 8% perf-profile.self.cycles-pp.dequeue_entities
0.20 -0.1 0.15 ± 7% perf-profile.self.cycles-pp.native_apic_msr_eoi
0.15 -0.1 0.10 ± 10% perf-profile.self.cycles-pp.get_timespec64
0.24 -0.0 0.19 ± 7% perf-profile.self.cycles-pp.do_poll
0.21 -0.0 0.16 ± 9% perf-profile.self.cycles-pp.setrlimit64
0.21 -0.0 0.16 ± 7% perf-profile.self.cycles-pp.ksys_write
0.19 ± 2% -0.0 0.14 ± 10% perf-profile.self.cycles-pp.recalc_sigpending
0.23 ± 2% -0.0 0.18 ± 8% perf-profile.self.cycles-pp.ksys_read
0.18 -0.0 0.14 ± 10% perf-profile.self.cycles-pp.__cond_resched
0.17 ± 2% -0.0 0.13 ± 8% perf-profile.self.cycles-pp.copy_page_from_iter
0.15 -0.0 0.11 ± 9% perf-profile.self.cycles-pp.os_xsave
0.11 ± 6% -0.0 0.07 ± 7% perf-profile.self.cycles-pp.__pick_next_task
0.16 ± 2% -0.0 0.12 ± 9% perf-profile.self.cycles-pp.__select
0.21 -0.0 0.17 ± 9% perf-profile.self.cycles-pp.copy_page_to_iter
0.17 ± 4% -0.0 0.13 ± 8% perf-profile.self.cycles-pp.__enqueue_entity
0.06 -0.0 0.02 ± 99% perf-profile.self.cycles-pp.__put_user_8
0.06 -0.0 0.02 ± 99% perf-profile.self.cycles-pp.update_irq_load_avg
0.09 ± 5% -0.0 0.06 ± 9% perf-profile.self.cycles-pp.common_nsleep
0.16 ± 3% -0.0 0.12 ± 7% perf-profile.self.cycles-pp.nohz_run_idle_balance
0.12 ± 3% -0.0 0.09 ± 7% perf-profile.self.cycles-pp.inode_needs_update_time
0.16 -0.0 0.13 ± 8% perf-profile.self.cycles-pp.security_file_permission
0.15 ± 2% -0.0 0.12 ± 9% perf-profile.self.cycles-pp.fput
0.11 -0.0 0.08 ± 7% perf-profile.self.cycles-pp.vruntime_eligible
0.12 ± 3% -0.0 0.09 ± 9% perf-profile.self.cycles-pp.__nanosleep
0.12 -0.0 0.09 ± 9% perf-profile.self.cycles-pp.__x64_sys_clock_nanosleep
0.07 -0.0 0.04 ± 44% perf-profile.self.cycles-pp.__fdelt_warn
0.13 ± 2% -0.0 0.10 ± 11% perf-profile.self.cycles-pp.__poll
0.12 -0.0 0.09 ± 9% perf-profile.self.cycles-pp.poll_select_finish
0.11 ± 4% -0.0 0.08 ± 8% perf-profile.self.cycles-pp.set_user_sigmask
0.13 ± 2% -0.0 0.10 ± 9% perf-profile.self.cycles-pp.pselect
0.10 ± 4% -0.0 0.08 ± 13% perf-profile.self.cycles-pp.get_nohz_timer_target
0.11 ± 4% -0.0 0.08 ± 8% perf-profile.self.cycles-pp.__getrlimit
0.09 -0.0 0.07 ± 7% perf-profile.self.cycles-pp.do_prlimit
0.11 ± 4% -0.0 0.08 ± 8% perf-profile.self.cycles-pp.rcu_all_qs
0.09 -0.0 0.07 ± 7% perf-profile.self.cycles-pp.rseq_update_cpu_node_id
0.07 ± 5% -0.0 0.04 ± 45% perf-profile.self.cycles-pp.get_sigset_argpack
0.10 ± 3% -0.0 0.08 ± 8% perf-profile.self.cycles-pp.do_nanosleep
0.10 ± 4% -0.0 0.08 ± 7% perf-profile.self.cycles-pp.file_update_time
0.07 ± 7% -0.0 0.04 ± 45% perf-profile.self.cycles-pp.kill_fasync
0.12 ± 3% -0.0 0.10 ± 9% perf-profile.self.cycles-pp.__memset
0.10 ± 5% -0.0 0.08 ± 12% perf-profile.self.cycles-pp.start_dl_timer
0.08 ± 4% -0.0 0.06 ± 11% perf-profile.self.cycles-pp.task_mm_cid_work
0.06 -0.0 0.04 ± 44% perf-profile.self.cycles-pp.__hrtimer_run_queues
0.09 -0.0 0.07 ± 10% perf-profile.self.cycles-pp.__x64_sys_ppoll
0.13 ± 3% -0.0 0.11 ± 7% perf-profile.self.cycles-pp.avg_vruntime
0.08 -0.0 0.06 ± 11% perf-profile.self.cycles-pp.add_wait_queue
0.08 -0.0 0.06 ± 11% perf-profile.self.cycles-pp.hrtimer_nanosleep
0.08 -0.0 0.06 ± 7% perf-profile.self.cycles-pp.do_pselect
0.07 -0.0 0.06 ± 8% perf-profile.self.cycles-pp.touch_atime
0.07 ± 7% +0.0 0.09 ± 6% perf-profile.self.cycles-pp._find_next_and_bit
0.00 +0.1 0.11 ± 6% perf-profile.self.cycles-pp.sched_balance_find_src_rq
0.40 +0.1 0.52 ± 8% perf-profile.self.cycles-pp.rb_erase
0.00 +0.2 0.15 ± 9% perf-profile.self.cycles-pp.update_curr_dl_se
0.29 ± 30% +0.3 0.58 ± 11% perf-profile.self.cycles-pp.update_sg_lb_stats
13.13 ± 5% +38.1 51.28 ± 8% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
  2025-01-20  6:57 [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression kernel test robot
@ 2025-01-20 11:27 ` Mateusz Guzik
  2025-01-20 12:22   ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread

From: Mateusz Guzik @ 2025-01-20 11:27 UTC (permalink / raw)
To: kernel test robot
Cc: Oleg Nesterov, oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel

On Mon, Jan 20, 2025 at 02:57:21PM +0800, kernel test robot wrote:
> we previously reported
> "[brauner-vfs:vfs-6.14.misc] [pipe_read] aaec5a95d5: hackbench.throughput 7.5% regression"
> in
> https://lore.kernel.org/all/202501101015.90874b3a-lkp@intel.com/
> but it seemed both you and Christian Brauner thought it could be ignored.
>
> Now we have captured a regression in another test case. Since the perf data
> contains entries like the ones below, we are not sure whether this supplies
> any useful information, but just FYI; sorry if it is still of little value.
>
>   9.45    -6.3    3.13 ± 9%  perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> ...
>  10.00    -6.5    3.53 ± 9%  perf-profile.children.cycles-pp.pipe_read
>   2.34    -1.3    1.07 ± 9%  perf-profile.children.cycles-pp.pipe_poll

Whatever the long-term fate of the patch, I think it would be prudent to
skip it in this merge window.

First, two notes:

1. The change only considers performing a wake up if the current source
buf got depleted -- if there is a blocked writer and there is at least
one byte in the current buf, nothing happens, which is where the
difference in results is coming from. (See the sketch below.)

2. stress-ng is not really a microbenchmark suite. It is more of a "do
stuff, some of which may accidentally line up with isolated behavior
from real workloads, and report ops/s". Plenty of it does not line up
with anything afaics (spoiler for the tee improvement below).
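
To make note 1 concrete, here is a minimal sketch of the difference
(hand-compressed pseudocode in the shape of fs/pipe.c, not the literal
diff; the names wake_writer and pipe_update_tail follow the commit, but
the surrounding logic is elided):

    /* Before aaec5a95d5: remember whether the pipe was full when the
     * read started, and wake the writer on the way out if so, even if
     * the read did not free a single buffer slot. */
    was_full = pipe_full(pipe->head, pipe->tail, pipe->max_usage);
    /* ... read loop consumes data ... */
    if (was_full)
        wake_up_interruptible_sync_poll(&pipe->wr_wait,
                                        EPOLLOUT | EPOLLWRNORM);

    /* After aaec5a95d5: only flag a wakeup at the moment a buffer is
     * fully consumed while the pipe is still full, i.e. when the read
     * actually frees a slot a blocked writer can use. */
    if (!buf->len) {
        wake_writer |= pipe_full(head, tail, pipe->max_usage);
        tail = pipe_update_tail(pipe, buf, tail);
    }
    /* ... */
    if (wake_writer)
        wake_up_interruptible_sync_poll(&pipe->wr_wait,
                                        EPOLLOUT | EPOLLWRNORM);
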
So, I had a look on a 24-way system and the results are as follows.

1. tee (500% win)

It performs tee/splice of a huge size along with 16-byte reads from
another worker. On a kernel without the change this results in
significant lock contention, as the writer keeps being woken up, whereas
with the change the bench gets to issue multiple reads without being
bothered (as long as there is any data in the buf).

For a real program this is more of a "don't do that" kind of deal imo,
so I don't think this particular win translates to a real-world benefit
(modulo shite progs).

2. hackbench (7.5% loss)

The lkp folk roll with a 128-way system and invoke:
/usr/bin/hackbench -g 64 -f 20 --process --pipe -l 60000 -s 100

Oleg did not specify his spec; I'm guessing 4 cores * 2 threads given:
hackbench -g 4 -f 10 --process --pipe -l 50000 -s 100
I presume the -g parameter got scaled down appropriately.

So, to match, I ran this on 24 cores like so:
hackbench -g 12 -f 20 --process --pipe -l 60000 -s 100

This is spawning a massive number of workers (480 in my case! 12 groups
* 20 sender/receiver pairs) and there is tons of lock contention (go
figure). I got a *massive* real time difference:

23.63s user 270.20s system 2312% cpu 12.71s (12.706) # without the patch
30.40s user 406.97s system 2293% cpu 19.07s (19.069) # with the patch

According to perf there is a significant increase in time spent
performing wake-ups by the writer, while the reader is spending more
time going off/on cpu in the read routine.

I think this makes sense -- on a kernel without the patch you are more
likely to get extra data before you drain the buffer.

As for the specific difference in performance, per the above all of this
is massively contended, and the wake-ups are going to alter which locks
are contended to what extent at different scales.

I fully concede I'm not at all confident hackbench is doing anything
realistic, but I'll also note that waking up a writer to get more data
seems to make sense.

To sum up, the win was registered for something which real programs
should not be doing. Seeing the loss in other benches, I think it would
be best to just drop the patch altogether.

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
  2025-01-20 11:27 ` Mateusz Guzik
@ 2025-01-20 12:22   ` Oleg Nesterov
  2025-01-20 12:42     ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread

From: Oleg Nesterov @ 2025-01-20 12:22 UTC (permalink / raw)
To: Mateusz Guzik
Cc: kernel test robot, oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel

On 01/20, Mateusz Guzik wrote:
>
> Whatever the long-term fate of the patch, I think it would be prudent to
> skip it in this merge window.

Perhaps... I'll try to take another look tomorrow.

Just one note right now.

> First, two notes:
>
> 1. The change only considers performing a wake up if the current source
> buf got depleted -- if there is a blocked writer and there is at least
> one byte in the current buf, nothing happens, which is where the
> difference in results is coming from.

Sorry, I don't understand. Unless this patch is buggy, pipe_read() must
always wake up a blocked writer if the writer can write at least one byte.

The writer can't write to the "current" buf = pipe->bufs[tail & mask] if
pipe_full() is still true.
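
For reference, the helpers behind that check live in
include/linux/pipe_fs_i.h and look roughly like this (paraphrased, see
the tree for the exact definitions):

    static inline unsigned int pipe_occupancy(unsigned int head,
                                              unsigned int tail)
    {
        return head - tail;
    }

    /* Full means occupancy reached the limit (pipe->max_usage); a
     * blocked writer only gets room once a read drops the occupancy
     * below that limit. */
    static inline bool pipe_full(unsigned int head, unsigned int tail,
                                 unsigned int limit)
    {
        return pipe_occupancy(head, tail) >= limit;
    }
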
Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
  2025-01-20 12:22   ` Oleg Nesterov
@ 2025-01-20 12:42     ` Oleg Nesterov
  2025-01-20 14:43       ` Oleg Nesterov
  2025-01-20 16:56       ` Mateusz Guzik
  0 siblings, 2 replies; 11+ messages in thread

From: Oleg Nesterov @ 2025-01-20 12:42 UTC (permalink / raw)
To: Mateusz Guzik
Cc: kernel test robot, oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel

Forgot to mention...

On 01/20, Oleg Nesterov wrote:
>
> On 01/20, Mateusz Guzik wrote:
> >
> > Whatever the long-term fate of the patch, I think it would be prudent to
> > skip it in this merge window.
>
> Perhaps... I'll try to take another look tomorrow.
>
> Just one note right now.
>
> > First, two notes:
> >
> > 1. The change only considers performing a wake up if the current source
> > buf got depleted -- if there is a blocked writer and there is at least
> > one byte in the current buf, nothing happens, which is where the
> > difference in results is coming from.
>
> Sorry, I don't understand. Unless this patch is buggy, pipe_read() must
> always wake up a blocked writer if the writer can write at least one byte.
>
> The writer can't write to the "current" buf = pipe->bufs[tail & mask] if
> pipe_full() is still true.

But I'll recheck this logic once again tomorrow; perhaps I misread
pipe_write() when I made this patch.

Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
  2025-01-20 12:42 ` Oleg Nesterov
@ 2025-01-20 14:43   ` Oleg Nesterov
  0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2025-01-20 14:43 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: kernel test robot, oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel

On 01/20, Oleg Nesterov wrote:
>
> But I'll recheck this logic once again tomorrow, perhaps I misread
> pipe_write() when I made this patch.

Meanwhile I wrote a stupid test-case below.

Without the patch:

	State:	S (sleeping)
	voluntary_ctxt_switches:	74
	nonvoluntary_ctxt_switches:	5
	State:	S (sleeping)
	voluntary_ctxt_switches:	4169
	nonvoluntary_ctxt_switches:	5
	finally release the buffer
	wrote next char!

With the patch:

	State:	S (sleeping)
	voluntary_ctxt_switches:	74
	nonvoluntary_ctxt_switches:	3
	State:	S (sleeping)
	voluntary_ctxt_switches:	74
	nonvoluntary_ctxt_switches:	3
	finally release the buffer
	wrote next char!

As you can see, without this patch pipe_read() wakes the writer up 4095
times for no reason; the writer burns a bit of CPU and blocks again after
each wakeup until the last read(fd[0], &c, 1).

Oleg.

-------------------------------------------------------------------------------
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>
#include <sys/ioctl.h>
#include <stdio.h>
#include <errno.h>

int main(void)
{
	int fd[2], nb, cnt;
	char cmd[1024], c;

	assert(pipe(fd) == 0);

	/* fill the pipe up to pipe_full() using nonblocking writes */
	nb = 1;
	assert(ioctl(fd[1], FIONBIO, &nb) == 0);
	while (write(fd[1], &c, 1) == 1);
	assert(errno == EAGAIN);
	nb = 0;
	assert(ioctl(fd[1], FIONBIO, &nb) == 0);

	// The pipe is full, the next write() will block.

	sprintf(cmd, "grep -e State -e ctxt_switches /proc/%d/status", getpid());
	if (!fork()) {
		// wait until the parent sleeps in pipe_write()
		usleep(10000);

		system(cmd);
		// trigger 4095 unnecessary wakeups
		for (cnt = 0; cnt < 4095; ++cnt) {
			assert(read(fd[0], &c, 1) == 1);
			usleep(1000);
		}
		system(cmd);

		// this should actually wake the writer
		printf("finally release the buffer\n");
		assert(read(fd[0], &c, 1) == 1);
		return 0;
	}

	assert(write(fd[1], &c, 1) == 1);
	printf("wrote next char!\n");

	return 0;
}

^ permalink raw reply	[flat|nested] 11+ messages in thread
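[To try this locally: the test case needs nothing beyond libc, so --
assuming it is saved as pipe-wake.c, a file name picked here purely for
illustration -- something like

	gcc -O2 pipe-wake.c -o pipe-wake && ./pipe-wake

should work on both kernels. The number to compare is the
voluntary_ctxt_switches delta between the two status dumps.]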
* Re: [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
  2025-01-20 12:42 ` Oleg Nesterov
  2025-01-20 14:43   ` Oleg Nesterov
@ 2025-01-20 16:56   ` Mateusz Guzik
  2025-01-20 20:31     ` Oleg Nesterov
  1 sibling, 1 reply; 11+ messages in thread
From: Mateusz Guzik @ 2025-01-20 16:56 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: kernel test robot, oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel

On Mon, Jan 20, 2025 at 1:42 PM Oleg Nesterov <oleg@redhat.com> wrote:
>
> Forgot to mention...
>
> On 01/20, Oleg Nesterov wrote:
> >
> > On 01/20, Mateusz Guzik wrote:
> > >
> > > Whatever the long term fate of the patch I think it would be prudent to
> > > skip it in this merge window.
> >
> > Perhaps... I'll try to take another look tomorrow.
> >
> > Just one note right now.
> >
> > > First two notes:
> > > 1. the change only considers performing a wake up if the current
> > > source buf got depleted -- if there is a blocked writer and there is at
> > > least one byte in the current buf nothing happens, which is where the
> > > difference in results is coming from
> >
> > Sorry I don't understand. Unless this patch is buggy, pipe_read() must
> > always wakeup a blocked writer if the writer can write at least one byte.
> >
> > The writer can't write to "current" buf = pipe->bufs[tail & mask] if
> > pipe_full() is still true.
>
> But I'll recheck this logic once again tomorrow, perhaps I misread
> pipe_write() when I made this patch.
>

While I'm too tired to dig into the code at the moment, I did manage to
grab an extra data point for hackbench.

Note on my setup (24-way) it takes way longer to execute with your patch.

I checked how often the sucker goes off cpu, like so: bpftrace -e
'kprobe:schedule { @[kstack()] = count(); }'

With your patch I reliably get about 38 mln calls from pipe_read.
Without your patch this drops to about 17 mln, as in less than half.

--
Mateusz Guzik <mjguzik gmail.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread
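[If someone wants to repeat this measurement with less noise from
unrelated tasks, bpftrace's comm builtin can restrict the count to the
benchmark processes, e.g. something along the lines of
bpftrace -e 'kprobe:schedule /comm == "hackbench"/ { @[kstack(5)] = count(); }'
-- the predicate and the kstack depth argument are standard bpftrace
features, though this exact invocation is untested here.]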
* Re: [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
  2025-01-20 16:56   ` Mateusz Guzik
@ 2025-01-20 20:31     ` Oleg Nesterov
  2025-01-20 21:15       ` Mateusz Guzik
  0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2025-01-20 20:31 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: kernel test robot, oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel

Mateusz,

I'm afraid my emails can look as if I am trying to deny the problem.
No, it's just that I think we need to understand why exactly this patch
makes a difference.

On 01/20, Mateusz Guzik wrote:
>
> While I'm too tired to dig into the code at the moment,

Me too.

> I checked how often the sucker goes off cpu, like so: bpftrace -e
> 'kprobe:schedule { @[kstack()] = count(); }'
>
> With your patch I reliably get about 38 mln calls from pipe_read.
> Without your patch this drops to about 17 mln, as in less than half.

Heh ;) I don't use bpftrace, but with the help of printk() I too noticed
the difference (although not that big) when I tried to understand the
1st report
https://lore.kernel.org/all/202501101015.90874b3a-lkp@intel.com/

Not that I really understand this difference, but I am not really
surprised. Without this patch the writers get more CPU time (due to the
unnecessary wakeups).

What really surprises me is that (with or without this patch) the
readers call wait_event/schedule MUUUUUUUUUUUUCH more than the writers.
I guess this is because sender() and receiver() are not "symmetric",
sender() writes to the "random" fd, while receiver() always reads from
the same ctx->in_fds[0]... Still not clear to me.

And I don't understand what workload this logic tries to simulate, but
this doesn't matter.

Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread
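[The sender()/receiver() asymmetry referred to above looks schematically
like the following. This is paraphrased from hackbench.c from memory --
variable and field names are approximate, consult the actual source.]

        /* Paraphrased sketch of hackbench's pipe mode, names approximate. */
        static void *sender(struct sender_context *ctx)
        {
                unsigned int i, j;

                for (i = 0; i < loops; i++)
                        for (j = 0; j < ctx->num_fds; j++)
                                /* each write goes to a different receiver's pipe */
                                write(ctx->out_fds[j], data, datasize);
                return NULL;
        }

        static void *receiver(struct receiver_context *ctx)
        {
                unsigned int i;

                for (i = 0; i < num_packets; i++)
                        /* always the same, single pipe */
                        read(ctx->in_fds[0], data, datasize);
                return NULL;
        }

Each receiver blocks on exactly one pipe while each sender spreads its
writes over many, so it is perhaps not surprising that the readers end
up in wait_event/schedule far more often than the writers.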
* Re: [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
  2025-01-20 20:31     ` Oleg Nesterov
@ 2025-01-20 21:15       ` Mateusz Guzik
  2025-01-23 12:56         ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Mateusz Guzik @ 2025-01-20 21:15 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: kernel test robot, oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel

On Mon, Jan 20, 2025 at 9:31 PM Oleg Nesterov <oleg@redhat.com> wrote:
> I'm afraid my emails can look as if I am trying to deny the problem.
> No, it's just that I think we need to understand why exactly this patch
> makes a difference.
>

I agree.

I was going to state there is 0 urgency as long as the patch does not
make the merge window, but it just did. Since the change does not
introduce a bug or crater performance (that we know of anyway), I guess
this outcome still means there is 0 urgency. ;)

> And I don't understand what workload this logic tries to simulate, but
> this doesn't matter.
>

So one would preferably survey a bunch of real workloads and see what
happens with real pipes under both policies -- the early wake up is
basically a tradeoff and it very well may be worth it in the real
world. However, I would argue the cycles needed for such an effort
would be best spent on other things.

Per one of my previous messages, the tee thing which got a significant
win is doing some crap which should be avoided in real programs. The
rest, with unknown real-world applicability, does suffer losses. The
early wake up definitely has its own merits, so one can't say outright
it was the right call to whack it.

My suggestion to Christian is to revert the patch and call it a day.
For all I know there are other yet-to-be-reported regressions lurking
(wins as well of course :>). By now there is no denying there is more
to the patch than originally anticipated, but it is also doubtful it is
worth poking around.

If you feel nerd-sniped to figure this out, then well, more power to
you. :)

Perhaps someone(tm) would be interested in looking at pipe performance
in general. I can tell you right now that there is definitely loss
stemming from repeated SMAP trips when changing buffers etc. Trying to
get a real understanding of what's up with pipes vs real workloads and
fixing whatever crappers pop up would justify the investigation.

That said, I'm buggering off this issue, cheers :)

--
Mateusz Guzik <mjguzik gmail.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
  2025-01-20 21:15       ` Mateusz Guzik
@ 2025-01-23 12:56         ` Oleg Nesterov
  0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2025-01-23 12:56 UTC (permalink / raw)
  To: Mateusz Guzik
  Cc: kernel test robot, oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel

Sorry for the delay,

On 01/20, Mateusz Guzik wrote:
>
> On Mon, Jan 20, 2025 at 9:31 PM Oleg Nesterov <oleg@redhat.com> wrote:
> > I'm afraid my emails can look as if I am trying to deny the problem.
> > No, it's just that I think we need to understand why exactly this patch
> > makes a difference.
>
> I agree.
>
> I was going to state there is 0 urgency as long as the patch does not
> make the merge window, but it just did.

Yes...

> So one would preferably survey a bunch of real workloads and see what
> happens with real pipes under both policies -- the early wake up is
> basically a tradeoff and it very well may be worth it in the real
> world.

The problem is that this early wakeup is not intended, the code is not
supposed to do this. So in some sense this patch fixes the
intended/documented "avoid unnecessary wakeups" logic.

Now I can reproduce the hackbench slowdown on my laptop, but I still
don't understand it... I'll try to think more over the weekend, then
I'll discuss the possible revert with Linus, who wrote that code and
reviewed this patch.

Thanks for your investigations,

Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
  2025-01-20  6:57 [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression kernel test robot
  2025-01-20 11:27 ` Mateusz Guzik
@ 2025-01-20 15:50 ` Oleg Nesterov
  2025-01-22  8:43   ` Oliver Sang
  1 sibling, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2025-01-20 15:50 UTC (permalink / raw)
  To: kernel test robot
  Cc: oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel, Mateusz Guzik

Again, I'll try to take another look tomorrow. Not sure I will find the
explanation though...

But can you help? I know nothing about stress-ng.

Google finds a lot of stress-ng repositories; I've cloned the 1st one,
https://github.com/ColinIanKing/stress-ng/blob/master/stress-poll.c
hopefully this is what you used.

On 01/20, kernel test robot wrote:
>
>       9.45            -6.3        3.13 ± 9%  perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> ...
>      10.00            -6.5        3.53 ± 9%  perf-profile.children.cycles-pp.pipe_read
>       2.34            -1.3        1.07 ± 9%  perf-profile.children.cycles-pp.pipe_poll

Could you explain what these numbers mean and how they are calculated?

"git-grep cycles-pp" finds nothing in stress-ng/ and tools/perf/

> kernel test robot noticed a 11.1% regression of stress-ng.poll.ops_per_sec on:

same for ops_per_sec

>       6150           -47.8%       3208        stress-ng.time.percent_of_cpu_this_job_got

same for percent_of_cpu_this_job_got

>       2993           -50.6%       1477        stress-ng.time.system_time
>     711.20           -36.0%     454.85        stress-ng.time.user_time

Is that what I think it is?? Does it run faster?

Or does it exit after some timeout, so that the decrease in
system/user_time can be explained by the change in the mysterious
'percent_of_cpu_this_job_got' above?

Oleg.

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [linux-next:master] [pipe_read] aaec5a95d5: stress-ng.poll.ops_per_sec 11.1% regression
  2025-01-20 15:50 ` Oleg Nesterov
@ 2025-01-22  8:43   ` Oliver Sang
  0 siblings, 0 replies; 11+ messages in thread
From: Oliver Sang @ 2025-01-22 8:43 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: oe-lkp, lkp, Christian Brauner, WangYuli, linux-fsdevel,
	Mateusz Guzik, oliver.sang

[-- Attachment #1: Type: text/plain, Size: 7098 bytes --]

hi, Oleg,

On Mon, Jan 20, 2025 at 04:50:10PM +0100, Oleg Nesterov wrote:
> Again, I'll try to take another look tomorrow. Not sure I will find the
> explanation though...
>
> But can you help? I know nothing about stress-ng.
>
> Google finds a lot of stress-ng repositories; I've cloned the 1st one,
> https://github.com/ColinIanKing/stress-ng/blob/master/stress-poll.c
> hopefully this is what you used.

yes, this is the one we used. (you can see it in
https://github.com/intel/lkp-tests/blob/master/programs/stress-ng/pkg/PKGBUILD)

>
> On 01/20, kernel test robot wrote:
> >
> >       9.45            -6.3        3.13 ± 9%  perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> > ...
> >      10.00            -6.5        3.53 ± 9%  perf-profile.children.cycles-pp.pipe_read
> >       2.34            -1.3        1.07 ± 9%  perf-profile.children.cycles-pp.pipe_poll
>
> Could you explain what these numbers mean and how they are calculated?
>
> "git-grep cycles-pp" finds nothing in stress-ng/ and tools/perf/

we use perf; the perf-profile is a so-called 'monitor' name.
https://github.com/intel/lkp-tests/blob/master/monitors/no-stdout/perf-profile
this link shows how this 'monitor' runs. I attached a perf-profile.gz
from one run FYI; within it, you can see something like "-e cycles:pp"

then for each run, all the data is parsed by
https://github.com/intel/lkp-tests/blob/master/programs/perf-profile/parse

for each commit, we run the test at least 6 times, then the comparison
list gives the avg and %stddev, like
  2.34            -1.3        1.07 ± 9%  perf-profile.****

>
> > kernel test robot noticed a 11.1% regression of stress-ng.poll.ops_per_sec on:
>
> same for ops_per_sec

this is the KPI for this stress-ng.poll test. the raw output looks like:

2025-01-17 00:02:55 stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --poll 224
stress-ng: info:  [6458] setting to a 1 min run per stressor
stress-ng: info:  [6458] dispatching hogs: 224 poll
stress-ng: info:  [6458] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty.
Setting this to 0 may improve performance metrics
stress-ng: metrc: [6458] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
stress-ng: metrc: [6458]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
stress-ng: metrc: [6458] poll          776040080     60.00    451.84   1488.52  12933899.40      399947.20        14.44          1844
stress-ng: info:  [6458] for a 60.08s run time:
stress-ng: info:  [6458]   13457.69s available CPU time
stress-ng: info:  [6458]     451.85s user time   (  3.36%)
stress-ng: info:  [6458]    1488.58s system time ( 11.06%)
stress-ng: info:  [6458]    1940.43s total time  ( 14.42%)
stress-ng: info:  [6458] load average: 44.46 13.42 4.66
stress-ng: info:  [6458] skipped: 0
stress-ng: info:  [6458] passed: 224: poll (224)
stress-ng: info:  [6458] failed: 0
stress-ng: info:  [6458] metrics untrustworthy: 0
stress-ng: info:  [6458] successful run completed in 1 min

we parse this data by
https://github.com/intel/lkp-tests/blob/master/programs/stress-ng/parse
and get the stress-ng.poll.ops_per_sec for this run from "bogo ops/s
(real time)" -- 12933899.40

again, the data below is the avg over multiple runs:
  14516970           -11.1%   12907569        stress-ng.poll.ops_per_sec
(no %stddev is shown since we omit it when it's < 3%, which means it is
stable enough)

>
> >       6150           -47.8%       3208        stress-ng.time.percent_of_cpu_this_job_got
>
> same for percent_of_cpu_this_job_got
>
> >       2993           -50.6%       1477        stress-ng.time.system_time
> >     711.20           -36.0%     454.85        stress-ng.time.user_time
>
> Is that what I think it is??

these time data are obtained by
https://github.com/intel/lkp-tests/blob/master/tests/wrapper#L38
below is raw data FYI.

	Command being timed: "/lkp/lkp/src/programs/stress-ng/run"
	User time (seconds): 451.86
	System time (seconds): 1488.79
	Percent of CPU this job got: 3222%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 1:00.21    <---- (1)
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 222324
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 87235
	Voluntary context switches: 194087227
	Involuntary context switches: 78364
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

>
> Or does it exit after some timeout, so that the decrease in
> system/user_time can be explained by the change in the mysterious
> 'percent_of_cpu_this_job_got' above?

the line marked (1) above seems expected: we run the test for 60s.

from the data you spotted, it seems the stress-ng test itself gets less
CPU time to run, which may explain why it becomes slower.

this reminded us that your fs/pipe commit could also have some impact on
our 'monitors', so we reran the tests with all monitors disabled, but
still saw a similar regression.

(below is the full comparison list; as you can see, it's much shorter
than in our original report, since the 'monitors' are now disabled, so
there is no monitor data.)
========================================================================================= compiler/cpufreq_governor/debug-setup/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: gcc-12/performance/no-monitor/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/poll/stress-ng/60s d2fc0ed52a284a13 aaec5a95d59615523db03dd53c2 ---------------- --------------------------- %stddev %change %stddev \ | \ 105111 ± 25% -58.9% 43248 ± 10% time.involuntary_context_switches 6036 -46.1% 3253 time.percent_of_cpu_this_job_got 2901 -48.7% 1489 time.system_time 734.52 -36.0% 469.77 time.user_time 4.472e+08 -55.5% 1.99e+08 time.voluntary_context_switches 8.817e+08 -9.8% 7.957e+08 stress-ng.poll.ops 14694617 -9.8% 13261019 stress-ng.poll.ops_per_sec 105111 ± 25% -58.9% 43248 ± 10% stress-ng.time.involuntary_context_switches 6036 -46.1% 3253 stress-ng.time.percent_of_cpu_this_job_got 2901 -48.7% 1489 stress-ng.time.system_time 734.52 -36.0% 469.77 stress-ng.time.user_time 4.472e+08 -55.5% 1.99e+08 stress-ng.time.voluntary_context_switches > > Oleg. > [-- Attachment #2: perf-profile.gz --] [-- Type: application/gzip, Size: 21745 bytes --] ^ permalink raw reply [flat|nested] 11+ messages in thread
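[For completeness: pulling the KPI out of the raw stress-ng log by hand
is a one-liner. Assuming the metrics line format shown earlier in this
message, something like
awk '/metrc:/ && $4 == "poll" { print $9 }' <logfile>
picks out the "bogo ops/s (real time)" column that becomes
stress-ng.poll.ops_per_sec.]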