From: kernel test robot <oliver.sang@intel.com>
To: <jason.zeng@intel.com>, <lin.x.wang@intel.com>, <pei.p.jia@intel.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>, <oliver.sang@intel.com>
Subject: [bytedance:6.6-velinux] [mm, pcp] 844cbbbcb6: netperf.Throughput_Mbps 87.1% regression
Date: Fri, 21 Feb 2025 16:24:27 +0800
Message-ID: <202502211544.6fa1d77f-lkp@intel.com>
Hi all,
Although the title says "87.1% regression", this could actually be an
improvement.
For the upstream version of this commit, we reported
"[linus:master] [mm, pcp] 362d37a106: netperf.Throughput_Mbps 14.5% improvement"
in
https://lore.kernel.org/all/202311141422.64f32250-oliver.sang@intel.com/
That report included the following note:
"
when this commit was a review patch in
https://lore.kernel.org/all/20231016053002.756205-4-ying.huang@intel.com/
we made two reports
[1] 'a 14.6% improvement of netperf.Throughput_Mbps'
in https://lore.kernel.org/all/202310271441.71ce0a9-oliver.sang@intel.com/
now in mainline, we confirmed this commit causes a similar performance change.
[2] 'a 60.4% regression of netperf.Throughput_Mbps'
in https://lore.kernel.org/all/202311061311.8d63998-oliver.sang@intel.com/
which, per your explanation in
https://lore.kernel.org/all/87ttpzv11u.fsf@yhuang6-desk2.ccr.corp.intel.com/,
we know is in fact also an improvement.
"
The test case for this report is also a UDP test. The full report is below.
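As a rough illustration of why the headline number can mislead for UDP_STREAM, the sketch below compares the send-side and receive-side netperf rows from the table in this report. It assumes netperf.Throughput_Mbps is the sender's rate and netperf.ThroughputRecv_Mbps is what the receiver actually got; the dictionary and function names are made up for this example.

```python
# Values copied from the netperf rows of the comparison table in this report.
# "base" is 3a1ca3b9e9, "patched" is 844cbbbcb6.
# Assumption: Throughput_Mbps is the send side, ThroughputRecv_Mbps the receive side.
sent = {"base": 53124.0, "patched": 6834.0}      # netperf.Throughput_Mbps
received = {"base": 1708.0, "patched": 5127.0}   # netperf.ThroughputRecv_Mbps


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def delivered_fraction(kernel):
    """Share of the sent traffic that the receiver reported getting."""
    return received[kernel] / sent[kernel]


for kernel in ("base", "patched"):
    print(f"{kernel:8s} sent={sent[kernel]:8.0f} Mbps  "
          f"received={received[kernel]:6.0f} Mbps  "
          f"delivered={delivered_fraction(kernel):.1%}")

print(f"send-side change:    {pct_change(sent['base'], sent['patched']):+.1f}%")
print(f"receive-side change: {pct_change(received['base'], received['patched']):+.1f}%")
```

If that reading of the two metrics is right, the sender pushes far fewer datagrams with the patched kernel, but a much larger share of them is actually delivered. The receive-side change computed from the rounded table values can differ slightly from the %change column, which is presumably computed from unrounded averages.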
Hello,
kernel test robot noticed an 87.1% regression of netperf.Throughput_Mbps on:
commit: 844cbbbcb6322c37d4a8d814146ddee4020cb1d6 ("mm, pcp: reduce lock contention for draining high-order pages")
https://github.com/bytedance/kernel.git 6.6-velinux
testcase: netperf
config: x86_64-bytedance-6.6-velinux
compiler: gcc-12
test machine: 240 threads 1 sockets Genuine Intel(R) 0000 (Granite Rapids) with 192G memory
parameters:
ip: ipv4
runtime: 300s
nr_threads: 50%
cluster: cs-localhost
test: UDP_STREAM
cpufreq_governor: performance
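For reference, nr_threads: 50% on this 240-thread machine should correspond to 120 netperf instances, assuming the percentage is taken relative to the CPU thread count. A quick check against the per-instance and total throughput rows in the table below is consistent with that. The variable names are only for this sketch.

```python
# Assumption: "nr_threads: 50%" means half of the machine's CPU threads.
cpu_threads = 240
nr_threads_percent = 50
expected_instances = cpu_threads * nr_threads_percent // 100

# Values copied from the comparison table in this report:
# (netperf.Throughput_Mbps, netperf.Throughput_total_Mbps)
rows = {
    "base": (53124.0, 6374987.0),
    "patched": (6834.0, 820137.0),
}


def implied_instances(per_instance, total):
    """Number of instances implied by total / per-instance throughput."""
    return total / per_instance


print("expected instances:", expected_instances)
for kernel, (per_instance, total) in rows.items():
    print(f"{kernel:8s} implied instances: {implied_instances(per_instance, total):.2f}")
```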
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250221/202502211544.6fa1d77f-lkp@intel.com
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
cs-localhost/gcc-12/performance/ipv4/x86_64-bytedance-6.6-velinux/50%/debian-12-x86_64-20240206.cgz/300s/lkp-gnr-1ap1/UDP_STREAM/netperf
commit:
3a1ca3b9e9 ("cacheinfo: calculate size of per-CPU data cache slice")
844cbbbcb6 ("mm, pcp: reduce lock contention for draining high-order pages")
3a1ca3b9e9076d07 844cbbbcb6322c37d4a8d814146
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.767e+09 ± 10% +200.9% 5.317e+09 ± 12% cpuidle..time
516488 ± 38% +9922.3% 51763954 ± 13% cpuidle..usage
33570 ± 46% +1161.5% 423488 ± 7% vmstat.system.cs
313633 +22.7% 384750 vmstat.system.in
2.27 ± 10% +4.9 7.21 ± 11% mpstat.cpu.all.idle%
2.73 ± 6% -2.3 0.47 ± 5% mpstat.cpu.all.soft%
0.41 ± 3% -0.3 0.16 ± 5% mpstat.cpu.all.usr%
13.20 ±189% +1141.7% 163.90 ± 17% mpstat.max_utilization.seconds
1451037 ± 2% +37.2% 1990349 ± 4% meminfo.Active
1451037 ± 2% +37.2% 1990349 ± 4% meminfo.Active(anon)
4995978 +11.7% 5582719 meminfo.Cached
2467001 +23.9% 3056897 ± 2% meminfo.Committed_AS
156250 ± 6% +49.9% 234214 ± 7% meminfo.Mapped
1490174 ± 2% +39.4% 2076914 ± 3% meminfo.Shmem
54833 ± 3% -78.2% 11961 ± 6% netperf.ThroughputBoth_Mbps
6580063 ± 3% -78.2% 1435428 ± 6% netperf.ThroughputBoth_total_Mbps
1708 ± 4% +200.0% 5127 netperf.ThroughputRecv_Mbps
205075 ± 4% +200.0% 615291 netperf.ThroughputRecv_total_Mbps
53124 ± 3% -87.1% 6834 ± 12% netperf.Throughput_Mbps
6374987 ± 3% -87.1% 820137 ± 12% netperf.Throughput_total_Mbps
11790 +0.8% 11887 netperf.time.percent_of_cpu_this_job_got
257.34 ± 4% -85.7% 36.81 ± 10% netperf.time.user_time
3.767e+09 ± 3% -78.2% 8.217e+08 ± 6% netperf.workload
362938 ± 2% +37.0% 497108 ± 4% proc-vmstat.nr_active_anon
1249221 +11.7% 1395341 proc-vmstat.nr_file_pages
207024 +6.4% 220206 ± 2% proc-vmstat.nr_inactive_anon
39522 ± 6% +48.2% 58577 ± 7% proc-vmstat.nr_mapped
372770 ± 2% +39.2% 518890 ± 3% proc-vmstat.nr_shmem
362938 ± 2% +37.0% 497108 ± 4% proc-vmstat.nr_zone_active_anon
207024 +6.4% 220206 ± 2% proc-vmstat.nr_zone_inactive_anon
3.767e+09 ± 3% -78.2% 8.223e+08 ± 6% proc-vmstat.numa_hit
3.767e+09 ± 3% -78.2% 8.223e+08 ± 6% proc-vmstat.numa_local
215619 ± 9% +51.9% 327546 ± 7% proc-vmstat.pgactivate
3.01e+10 ± 3% -78.2% 6.56e+09 ± 6% proc-vmstat.pgalloc_normal
1112484 +13.1% 1257868 proc-vmstat.pgfault
3.01e+10 ± 3% -78.2% 6.559e+09 ± 6% proc-vmstat.pgfree
48331 -2.7% 47025 proc-vmstat.pgreuse
0.09 ± 41% -100.0% 0.00 sched_debug.cfs_rq:/.h_nr_running.min
0.01 ± 17% +212.1% 0.03 ± 6% sched_debug.cfs_rq:/.h_nr_running.stddev
399.18 ± 41% -100.0% 0.00 sched_debug.cfs_rq:/.load.min
0.42 ± 10% -67.0% 0.14 ± 30% sched_debug.cfs_rq:/.load_avg.min
0.09 ± 41% -100.0% 0.00 sched_debug.cfs_rq:/.nr_running.min
0.01 ± 33% +426.8% 0.03 ± 6% sched_debug.cfs_rq:/.nr_running.stddev
85.66 ± 41% -84.2% 13.56 ± 78% sched_debug.cfs_rq:/.runnable_avg.min
9.55 ± 12% +111.7% 20.21 ± 5% sched_debug.cfs_rq:/.runnable_avg.stddev
72.26 ± 41% -83.5% 11.94 ± 86% sched_debug.cfs_rq:/.util_avg.min
8.05 ± 20% +152.4% 20.32 ± 5% sched_debug.cfs_rq:/.util_avg.stddev
109878 ± 10% -62.6% 41047 ± 6% sched_debug.cpu.avg_idle.avg
45.57 ± 28% +375.7% 216.75 ± 5% sched_debug.cpu.curr->pid.stddev
0.01 ± 17% +213.9% 0.03 ± 5% sched_debug.cpu.nr_running.stddev
3084 ± 46% +1118.2% 37577 ± 8% sched_debug.cpu.nr_switches.avg
150.83 ± 11% +1573.7% 2524 ± 56% sched_debug.cpu.nr_switches.min
7153 ± 47% +163.0% 18813 ± 9% sched_debug.cpu.nr_switches.stddev
0.10 ±153% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
0.15 ± 3% +162.9% 0.38 ± 16% perf-sched.sch_delay.avg.ms.__cond_resched.wait_for_completion.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
0.06 ±109% -81.6% 0.01 ± 49% perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.43 ± 41% -87.4% 0.05 ± 24% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
0.09 ± 82% +328.7% 0.40 ± 50% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.usleep_range_state.asix_check_host_enable.__asix_mdio_read
0.02 ± 8% +313.7% 0.08 ± 13% perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.56 ±151% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
5.35 ± 49% -69.7% 1.62 ± 55% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
18.33 ±126% +448.0% 100.42 ± 18% perf-sched.sch_delay.max.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
0.28 ± 61% +596.9% 1.93 ± 42% perf-sched.sch_delay.max.ms.schedule_timeout.wait_for_completion_timeout.usb_start_wait_urb.usb_control_msg
50.51 ±137% -94.4% 2.84 ± 10% perf-sched.total_wait_and_delay.average.ms
115144 ± 52% +608.1% 815364 ± 10% perf-sched.total_wait_and_delay.count.ms
50.48 ±137% -94.4% 2.82 ± 10% perf-sched.total_wait_time.average.ms
18.53 ±134% +286.3% 71.58 ± 6% perf-sched.wait_and_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
260.84 ± 45% -60.7% 102.56 ± 17% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
0.03 ± 50% +247.7% 0.11 ± 9% perf-sched.wait_and_delay.avg.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
57.80 ± 36% +122.8% 128.80 ± 17% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
51447 ± 57% +1432.8% 788592 ± 10% perf-sched.wait_and_delay.count.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
1829 ± 6% +23.3% 2256 ± 4% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1237 -12.0% 1088 perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
36.90 ±125% +362.7% 170.74 ± 16% perf-sched.wait_and_delay.max.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
18.53 ±134% +286.3% 71.58 ± 6% perf-sched.wait_time.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
2.10 ± 6% -34.6% 1.37 ± 9% perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
260.41 ± 45% -60.6% 102.51 ± 17% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
0.33 ± 18% +127.6% 0.74 ± 26% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.usleep_range_state.asix_check_host_enable.__asix_mdio_read
0.02 ± 50% +403.6% 0.10 ± 9% perf-sched.wait_time.avg.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
18.60 ±124% +390.3% 91.19 ± 15% perf-sched.wait_time.max.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
0.01 ± 4% +56.3% 0.02 ± 2% perf-stat.i.MPKI
2.589e+10 ± 3% -32.7% 1.743e+10 perf-stat.i.branch-instructions
0.07 ± 3% +0.1 0.12 perf-stat.i.branch-miss-rate%
13640490 ± 3% +36.5% 18613531 perf-stat.i.branch-misses
0.14 ± 5% -0.0 0.11 ± 3% perf-stat.i.cache-miss-rate%
615296 ± 2% +38.7% 853112 perf-stat.i.cache-misses
9.294e+08 ± 9% +215.2% 2.93e+09 ± 2% perf-stat.i.cache-references
32970 ± 46% +1189.0% 424977 ± 7% perf-stat.i.context-switches
6.02 ± 3% +65.7% 9.97 perf-stat.i.cpi
7.978e+11 -4.8% 7.596e+11 perf-stat.i.cpu-cycles
256.89 +955.7% 2712 ± 23% perf-stat.i.cpu-migrations
27088580 ± 23% -58.5% 11243498 ± 2% perf-stat.i.cycles-between-cache-misses
1.316e+11 ± 3% -42.4% 7.576e+10 perf-stat.i.instructions
0.17 ± 3% -38.9% 0.11 perf-stat.i.ipc
0.02 ± 32% +346.7% 0.07 ± 41% perf-stat.i.major-faults
2.28 -22.0% 1.78 ± 7% perf-stat.i.metric.K/sec
3275 +20.5% 3947 perf-stat.i.minor-faults
3275 +20.5% 3947 perf-stat.i.page-faults
0.00 ± 4% +139.0% 0.01 ± 2% perf-stat.overall.MPKI
0.05 ± 2% +0.1 0.11 perf-stat.overall.branch-miss-rate%
0.07 ± 10% -0.0 0.03 ± 2% perf-stat.overall.cache-miss-rate%
6.07 ± 3% +65.1% 10.02 perf-stat.overall.cpi
1229111 -30.9% 849126 perf-stat.overall.cycles-between-cache-misses
0.17 ± 3% -39.5% 0.10 perf-stat.overall.ipc
10676 +162.5% 28031 ± 5% perf-stat.overall.path-length
2.583e+10 ± 3% -32.6% 1.74e+10 perf-stat.ps.branch-instructions
13627561 ± 3% +36.0% 18539008 perf-stat.ps.branch-misses
647569 +37.8% 892562 perf-stat.ps.cache-misses
9.269e+08 ± 9% +214.9% 2.919e+09 ± 2% perf-stat.ps.cache-references
33672 ± 46% +1164.1% 425654 ± 7% perf-stat.ps.context-switches
7.958e+11 -4.8% 7.578e+11 perf-stat.ps.cpu-cycles
254.85 +971.0% 2729 ± 23% perf-stat.ps.cpu-migrations
1.313e+11 ± 3% -42.4% 7.564e+10 perf-stat.ps.instructions
0.02 ± 32% +355.9% 0.07 ± 42% perf-stat.ps.major-faults
3220 +16.1% 3737 perf-stat.ps.minor-faults
3220 +16.1% 3737 perf-stat.ps.page-faults
4.021e+13 ± 3% -42.9% 2.296e+13 perf-stat.total.instructions
24.27 ± 3% -24.3 0.00 perf-profile.calltrace.cycles-pp.__ip_make_skb.ip_make_skb.udp_sendmsg.__sys_sendto.__x64_sys_sendto
24.12 ± 3% -24.1 0.00 perf-profile.calltrace.cycles-pp.__ip_select_ident.__ip_make_skb.ip_make_skb.udp_sendmsg.__sys_sendto
48.45 -6.5 41.97 perf-profile.calltrace.cycles-pp.__raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page.skb_release_data.__consume_stateless_skb
48.52 -6.4 42.11 perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page.skb_release_data.__consume_stateless_skb.udp_recvmsg
48.34 -6.4 41.94 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.__raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page.skb_release_data
48.84 -6.4 42.47 perf-profile.calltrace.cycles-pp.skb_release_data.__consume_stateless_skb.udp_recvmsg.inet_recvmsg.sock_recvmsg
48.84 -6.4 42.47 perf-profile.calltrace.cycles-pp.__consume_stateless_skb.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom
48.66 -6.3 42.41 perf-profile.calltrace.cycles-pp.free_unref_page.skb_release_data.__consume_stateless_skb.udp_recvmsg.inet_recvmsg
8.79 ± 5% -5.9 2.88 ± 3% perf-profile.calltrace.cycles-pp.ip_generic_getfrag.__ip_append_data.ip_make_skb.udp_sendmsg.__sys_sendto
8.62 ± 5% -5.8 2.86 ± 3% perf-profile.calltrace.cycles-pp._copy_from_iter.ip_generic_getfrag.__ip_append_data.ip_make_skb.udp_sendmsg
8.33 ± 5% -5.6 2.77 ± 3% perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.ip_generic_getfrag.__ip_append_data.ip_make_skb
5.04 ± 7% -4.4 0.61 ± 5% perf-profile.calltrace.cycles-pp.udp_send_skb.udp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
4.92 ± 8% -4.3 0.60 ± 5% perf-profile.calltrace.cycles-pp.ip_send_skb.udp_send_skb.udp_sendmsg.__sys_sendto.__x64_sys_sendto
4.59 ± 11% -4.0 0.58 ± 5% perf-profile.calltrace.cycles-pp.ip_finish_output2.ip_send_skb.udp_send_skb.udp_sendmsg.__sys_sendto
4.51 ± 11% -3.9 0.57 ± 5% perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.ip_send_skb.udp_send_skb.udp_sendmsg
49.61 -3.4 46.17 perf-profile.calltrace.cycles-pp.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
49.62 -3.4 46.17 perf-profile.calltrace.cycles-pp.inet_recvmsg.sock_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64
49.62 -3.4 46.18 perf-profile.calltrace.cycles-pp.sock_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
49.64 -3.4 46.24 perf-profile.calltrace.cycles-pp.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvfrom
49.64 -3.4 46.24 perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvfrom.recv_omni
49.65 -3.4 46.27 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvfrom.recv_omni.process_requests
49.66 -3.4 46.28 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recvfrom.recv_omni.process_requests.spawn_child
49.66 -3.4 46.31 perf-profile.calltrace.cycles-pp.recvfrom.recv_omni.process_requests.spawn_child.accept_connection
49.68 -3.3 46.34 perf-profile.calltrace.cycles-pp.accept_connection.accept_connections.main
49.68 -3.3 46.34 perf-profile.calltrace.cycles-pp.accept_connections.main
49.68 -3.3 46.34 perf-profile.calltrace.cycles-pp.process_requests.spawn_child.accept_connection.accept_connections.main
49.68 -3.3 46.34 perf-profile.calltrace.cycles-pp.spawn_child.accept_connection.accept_connections.main
49.68 -3.3 46.34 perf-profile.calltrace.cycles-pp.recv_omni.process_requests.spawn_child.accept_connection.accept_connections
2.86 ± 5% -2.5 0.32 ± 81% perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.ip_send_skb
2.87 ± 5% -2.5 0.37 ± 65% perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.ip_send_skb.udp_send_skb
99.61 -0.4 99.17 perf-profile.calltrace.cycles-pp.main
0.00 +0.8 0.76 ± 16% perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp
0.00 +0.8 0.76 ± 16% perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
0.00 +0.8 0.76 ± 15% perf-profile.calltrace.cycles-pp.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg.inet_recvmsg
0.00 +0.8 0.78 ± 15% perf-profile.calltrace.cycles-pp.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg.inet_recvmsg.sock_recvmsg
0.00 +1.0 0.96 ± 12% perf-profile.calltrace.cycles-pp.__skb_recv_udp.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom
0.61 ± 9% +1.8 2.41 ± 2% perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg
0.62 ± 9% +1.8 2.43 ± 2% perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg
0.67 ± 8% +2.0 2.63 ± 2% perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg.sock_recvmsg
0.67 ± 8% +2.0 2.64 ± 2% perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom
49.93 +2.9 52.83 perf-profile.calltrace.cycles-pp.send_udp_stream.main
49.92 +2.9 52.83 perf-profile.calltrace.cycles-pp.send_omni_inner.send_udp_stream.main
49.62 +3.2 52.79 perf-profile.calltrace.cycles-pp.sendto.send_omni_inner.send_udp_stream.main
49.50 +3.3 52.77 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sendto.send_omni_inner.send_udp_stream.main
49.46 +3.3 52.76 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendto.send_omni_inner.send_udp_stream
49.36 +3.4 52.74 perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendto.send_omni_inner
49.34 +3.4 52.74 perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendto
48.94 +3.7 52.68 perf-profile.calltrace.cycles-pp.udp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
43.24 +8.7 51.98 perf-profile.calltrace.cycles-pp.ip_make_skb.udp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
18.89 ± 3% +32.9 51.82 perf-profile.calltrace.cycles-pp.__ip_append_data.ip_make_skb.udp_sendmsg.__sys_sendto.__x64_sys_sendto
9.11 +39.6 48.74 perf-profile.calltrace.cycles-pp.sk_page_frag_refill.__ip_append_data.ip_make_skb.udp_sendmsg.__sys_sendto
9.10 +39.6 48.74 perf-profile.calltrace.cycles-pp.skb_page_frag_refill.sk_page_frag_refill.__ip_append_data.ip_make_skb.udp_sendmsg
8.91 +39.8 48.69 perf-profile.calltrace.cycles-pp.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill.__ip_append_data.ip_make_skb
8.81 +39.8 48.66 perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill.__ip_append_data
8.00 +40.4 48.44 perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill
6.78 +41.2 47.98 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.__raw_spin_lock_irqsave.rmqueue_bulk.rmqueue.get_page_from_freelist
6.80 +41.2 48.01 perf-profile.calltrace.cycles-pp.__raw_spin_lock_irqsave.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages
6.89 +41.4 48.32 perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages.skb_page_frag_refill
24.28 ± 3% -24.1 0.15 ± 16% perf-profile.children.cycles-pp.__ip_make_skb
24.14 ± 3% -24.0 0.12 ± 16% perf-profile.children.cycles-pp.__ip_select_ident
50.61 -8.1 42.51 perf-profile.children.cycles-pp.skb_release_data
50.19 -7.8 42.44 perf-profile.children.cycles-pp.free_unref_page
48.53 -6.4 42.11 perf-profile.children.cycles-pp.free_pcppages_bulk
48.84 -6.4 42.47 perf-profile.children.cycles-pp.__consume_stateless_skb
8.81 ± 5% -5.9 2.89 ± 3% perf-profile.children.cycles-pp.ip_generic_getfrag
8.63 ± 5% -5.8 2.86 ± 3% perf-profile.children.cycles-pp._copy_from_iter
8.52 ± 5% -5.7 2.86 ± 3% perf-profile.children.cycles-pp.copyin
5.05 ± 7% -4.4 0.61 ± 5% perf-profile.children.cycles-pp.udp_send_skb
4.92 ± 8% -4.3 0.60 ± 5% perf-profile.children.cycles-pp.ip_send_skb
4.60 ± 11% -4.0 0.58 ± 5% perf-profile.children.cycles-pp.ip_finish_output2
4.52 ± 11% -3.9 0.57 ± 5% perf-profile.children.cycles-pp.__dev_queue_xmit
49.61 -3.4 46.17 perf-profile.children.cycles-pp.udp_recvmsg
49.62 -3.4 46.18 perf-profile.children.cycles-pp.sock_recvmsg
49.62 -3.4 46.17 perf-profile.children.cycles-pp.inet_recvmsg
49.64 -3.4 46.24 perf-profile.children.cycles-pp.__x64_sys_recvfrom
49.64 -3.4 46.24 perf-profile.children.cycles-pp.__sys_recvfrom
49.67 -3.3 46.32 perf-profile.children.cycles-pp.recvfrom
49.68 -3.3 46.34 perf-profile.children.cycles-pp.accept_connection
49.68 -3.3 46.34 perf-profile.children.cycles-pp.accept_connections
49.68 -3.3 46.34 perf-profile.children.cycles-pp.process_requests
49.68 -3.3 46.34 perf-profile.children.cycles-pp.recv_omni
49.68 -3.3 46.34 perf-profile.children.cycles-pp.spawn_child
2.88 ± 5% -2.4 0.52 ± 5% perf-profile.children.cycles-pp.__local_bh_enable_ip
2.87 ± 5% -2.4 0.51 ± 5% perf-profile.children.cycles-pp.do_softirq
2.85 ± 5% -2.3 0.54 ± 5% perf-profile.children.cycles-pp.handle_softirqs
2.79 ± 5% -2.3 0.50 ± 4% perf-profile.children.cycles-pp.net_rx_action
2.73 ± 5% -2.2 0.49 ± 4% perf-profile.children.cycles-pp.__napi_poll
2.72 ± 5% -2.2 0.49 ± 4% perf-profile.children.cycles-pp.process_backlog
2.59 ± 5% -2.1 0.47 ± 4% perf-profile.children.cycles-pp.__netif_receive_skb_one_core
2.39 ± 5% -1.9 0.44 ± 4% perf-profile.children.cycles-pp.ip_local_deliver_finish
2.38 ± 5% -1.9 0.44 ± 4% perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
2.32 ± 5% -1.9 0.43 ± 4% perf-profile.children.cycles-pp.__udp4_lib_rcv
2.12 ± 5% -1.7 0.40 ± 3% perf-profile.children.cycles-pp.udp_unicast_rcv_skb
2.10 ± 5% -1.7 0.39 ± 3% perf-profile.children.cycles-pp.udp_queue_rcv_one_skb
1.24 ± 6% -1.1 0.10 ± 11% perf-profile.children.cycles-pp.free_unref_page_commit
99.98 -0.4 99.55 perf-profile.children.cycles-pp.main
0.50 ± 4% -0.4 0.07 ± 12% perf-profile.children.cycles-pp.ip_route_output_flow
0.47 ± 6% -0.4 0.11 ± 8% perf-profile.children.cycles-pp.__zone_watermark_ok
0.38 ± 5% -0.3 0.08 ± 10% perf-profile.children.cycles-pp.sock_alloc_send_pskb
0.34 ± 5% -0.3 0.05 ± 34% perf-profile.children.cycles-pp.ip_route_output_key_hash_rcu
0.30 ± 6% -0.2 0.06 ± 7% perf-profile.children.cycles-pp.alloc_skb_with_frags
0.29 ± 6% -0.2 0.06 ± 7% perf-profile.children.cycles-pp.__alloc_skb
0.18 ± 4% -0.1 0.07 ± 5% perf-profile.children.cycles-pp.free_unref_page_prepare
0.21 ± 4% -0.0 0.17 ± 2% perf-profile.children.cycles-pp.__check_object_size
0.12 ± 5% +0.0 0.15 ± 3% perf-profile.children.cycles-pp.check_heap_object
0.22 ± 5% +0.0 0.25 ± 2% perf-profile.children.cycles-pp.vfs_write
0.21 ± 4% +0.0 0.24 ± 3% perf-profile.children.cycles-pp.shmem_file_write_iter
0.20 ± 4% +0.0 0.23 ± 3% perf-profile.children.cycles-pp.generic_perform_write
0.00 +0.1 0.05 ± 5% perf-profile.children.cycles-pp.sysvec_call_function_single
0.06 ± 5% +0.1 0.11 perf-profile.children.cycles-pp.__free_one_page
0.00 +0.1 0.05 ± 7% perf-profile.children.cycles-pp.activate_task
0.05 +0.1 0.10 ± 8% perf-profile.children.cycles-pp.scheduler_tick
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.ttwu_do_activate
0.08 ± 7% +0.1 0.14 ± 2% perf-profile.children.cycles-pp.shmem_get_folio_gfp
0.08 ± 7% +0.1 0.14 perf-profile.children.cycles-pp.shmem_write_begin
0.00 +0.1 0.06 ± 8% perf-profile.children.cycles-pp.update_load_avg
0.00 +0.1 0.06 ± 13% perf-profile.children.cycles-pp.sched_ttwu_pending
0.00 +0.1 0.06 perf-profile.children.cycles-pp.asm_sysvec_call_function_single
0.05 +0.1 0.11 perf-profile.children.cycles-pp.__folio_alloc
0.05 +0.1 0.11 perf-profile.children.cycles-pp.shmem_alloc_and_acct_folio
0.05 +0.1 0.11 perf-profile.children.cycles-pp.shmem_alloc_folio
0.05 +0.1 0.11 perf-profile.children.cycles-pp.vma_alloc_folio
0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.task_tick_fair
0.00 +0.1 0.07 ± 10% perf-profile.children.cycles-pp._raw_spin_lock
0.00 +0.1 0.07 ± 18% perf-profile.children.cycles-pp.intel_idle
0.00 +0.1 0.07 ± 5% perf-profile.children.cycles-pp.try_charge_memcg
0.00 +0.1 0.07 ± 4% perf-profile.children.cycles-pp.__sk_mem_reduce_allocated
0.00 +0.1 0.07 ± 12% perf-profile.children.cycles-pp.__flush_smp_call_function_queue
0.11 ± 8% +0.1 0.20 ± 3% perf-profile.children.cycles-pp.hrtimer_interrupt
0.12 ± 5% +0.1 0.20 ± 3% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.18 ± 4% +0.1 0.27 ± 2% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.00 +0.1 0.09 ± 3% perf-profile.children.cycles-pp.mem_cgroup_charge_skmem
0.00 +0.1 0.09 ± 6% perf-profile.children.cycles-pp.udp_rmem_release
0.08 ± 3% +0.1 0.17 ± 3% perf-profile.children.cycles-pp.tick_sched_timer
0.08 ± 4% +0.1 0.18 ± 4% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.07 +0.1 0.17 ± 3% perf-profile.children.cycles-pp.update_process_times
0.07 +0.1 0.17 ± 4% perf-profile.children.cycles-pp.tick_sched_handle
0.12 ± 8% +0.1 0.22 ± 3% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.00 +0.1 0.10 ± 17% perf-profile.children.cycles-pp.cpu_util
0.12 ± 3% +0.1 0.24 ± 2% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.00 +0.1 0.12 ± 3% perf-profile.children.cycles-pp.__sk_mem_raise_allocated
0.00 +0.1 0.12 ± 3% perf-profile.children.cycles-pp.__sk_mem_schedule
0.00 +0.1 0.13 ± 9% perf-profile.children.cycles-pp.autoremove_wake_function
0.00 +0.1 0.13 ± 9% perf-profile.children.cycles-pp.try_to_wake_up
0.00 +0.1 0.14 ± 8% perf-profile.children.cycles-pp.__wake_up_common
0.00 +0.1 0.14 ± 8% perf-profile.children.cycles-pp.__wake_up_common_lock
0.00 +0.1 0.14 ± 8% perf-profile.children.cycles-pp.schedule_idle
0.00 +0.1 0.15 ± 3% perf-profile.children.cycles-pp.simple_copy_to_iter
0.00 +0.2 0.15 ± 6% perf-profile.children.cycles-pp.sock_def_readable
0.00 +0.2 0.18 ± 44% perf-profile.children.cycles-pp.cpuidle_enter
0.00 +0.2 0.18 ± 44% perf-profile.children.cycles-pp.cpuidle_enter_state
0.09 ± 4% +0.2 0.29 ± 4% perf-profile.children.cycles-pp.__udp_enqueue_schedule_skb
0.00 +0.2 0.21 ± 39% perf-profile.children.cycles-pp.cpuidle_idle_call
0.00 +0.4 0.42 ± 20% perf-profile.children.cycles-pp.start_secondary
0.00 +0.4 0.43 ± 20% perf-profile.children.cycles-pp.cpu_startup_entry
0.00 +0.4 0.43 ± 20% perf-profile.children.cycles-pp.do_idle
0.00 +0.4 0.43 ± 20% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
0.00 +0.6 0.62 ± 18% perf-profile.children.cycles-pp.update_sg_lb_stats
0.00 +0.7 0.66 ± 18% perf-profile.children.cycles-pp.update_sd_lb_stats
0.00 +0.7 0.67 ± 18% perf-profile.children.cycles-pp.find_busiest_group
0.00 +0.7 0.69 ± 18% perf-profile.children.cycles-pp.load_balance
0.00 +0.7 0.71 ± 18% perf-profile.children.cycles-pp.newidle_balance
0.00 +0.7 0.72 ± 18% perf-profile.children.cycles-pp.pick_next_task_fair
0.00 +0.8 0.76 ± 15% perf-profile.children.cycles-pp.schedule
0.00 +0.8 0.76 ± 15% perf-profile.children.cycles-pp.schedule_timeout
0.00 +0.8 0.78 ± 15% perf-profile.children.cycles-pp.__skb_wait_for_more_packets
0.07 ± 7% +0.9 0.96 ± 12% perf-profile.children.cycles-pp.__skb_recv_udp
0.00 +0.9 0.90 ± 14% perf-profile.children.cycles-pp.__schedule
0.61 ± 9% +1.8 2.42 ± 2% perf-profile.children.cycles-pp.copyout
0.62 ± 9% +1.8 2.43 ± 2% perf-profile.children.cycles-pp._copy_to_iter
0.67 ± 8% +2.0 2.63 ± 2% perf-profile.children.cycles-pp.__skb_datagram_iter
0.67 ± 8% +2.0 2.64 ± 2% perf-profile.children.cycles-pp.skb_copy_datagram_iter
49.93 +2.9 52.83 perf-profile.children.cycles-pp.send_udp_stream
49.92 +2.9 52.83 perf-profile.children.cycles-pp.send_omni_inner
49.75 +3.1 52.81 perf-profile.children.cycles-pp.sendto
49.36 +3.4 52.74 perf-profile.children.cycles-pp.__x64_sys_sendto
49.36 +3.4 52.74 perf-profile.children.cycles-pp.__sys_sendto
48.95 +3.7 52.68 perf-profile.children.cycles-pp.udp_sendmsg
43.25 +8.7 51.98 perf-profile.children.cycles-pp.ip_make_skb
18.90 ± 3% +32.9 51.82 perf-profile.children.cycles-pp.__ip_append_data
55.34 +34.8 90.11 perf-profile.children.cycles-pp.__raw_spin_lock_irqsave
55.16 +34.9 90.04 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
9.12 +39.6 48.74 perf-profile.children.cycles-pp.sk_page_frag_refill
9.11 +39.6 48.74 perf-profile.children.cycles-pp.skb_page_frag_refill
8.96 +39.8 48.80 perf-profile.children.cycles-pp.__alloc_pages
8.87 +39.9 48.77 perf-profile.children.cycles-pp.get_page_from_freelist
8.05 +40.5 48.55 perf-profile.children.cycles-pp.rmqueue
6.93 +41.5 48.43 perf-profile.children.cycles-pp.rmqueue_bulk
24.03 ± 3% -23.9 0.11 ± 16% perf-profile.self.cycles-pp.__ip_select_ident
8.50 ± 5% -5.7 2.85 ± 3% perf-profile.self.cycles-pp.copyin
1.24 ± 6% -1.1 0.10 ± 12% perf-profile.self.cycles-pp.free_unref_page_commit
1.07 ± 6% -1.0 0.11 ± 8% perf-profile.self.cycles-pp.rmqueue
0.59 ± 5% -0.5 0.12 ± 5% perf-profile.self.cycles-pp.__ip_append_data
0.46 ± 5% -0.4 0.11 ± 9% perf-profile.self.cycles-pp.__zone_watermark_ok
0.30 ± 4% -0.2 0.09 ± 5% perf-profile.self.cycles-pp.get_page_from_freelist
0.17 ± 2% -0.1 0.08 ± 4% perf-profile.self.cycles-pp.__raw_spin_lock_irqsave
0.11 ± 4% -0.1 0.05 ± 9% perf-profile.self.cycles-pp.free_unref_page_prepare
0.04 ± 33% +0.0 0.09 ± 5% perf-profile.self.cycles-pp.__free_one_page
0.00 +0.1 0.06 ± 8% perf-profile.self.cycles-pp._raw_spin_lock
0.07 ± 6% +0.1 0.13 ± 3% perf-profile.self.cycles-pp.check_heap_object
0.00 +0.1 0.06 ± 4% perf-profile.self.cycles-pp.__skb_datagram_iter
0.00 +0.1 0.07 ± 18% perf-profile.self.cycles-pp.intel_idle
0.00 +0.1 0.08 ± 3% perf-profile.self.cycles-pp.udp_recvmsg
0.00 +0.1 0.08 ± 17% perf-profile.self.cycles-pp.try_to_wake_up
0.00 +0.1 0.10 ± 18% perf-profile.self.cycles-pp.cpu_util
0.06 +0.1 0.18 perf-profile.self.cycles-pp.rmqueue_bulk
0.00 +0.4 0.45 ± 18% perf-profile.self.cycles-pp.update_sg_lb_stats
0.61 ± 9% +1.8 2.41 ± 2% perf-profile.self.cycles-pp.copyout
55.16 +34.9 90.04 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
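As a cross-check on the perf-stat rows above, CPI and IPC can be recomputed from the cycle and instruction rates. This is only a sketch using the rounded table values, so small differences from the perf-stat.overall.cpi and perf-stat.overall.ipc rows are expected; the names are made up for this example.

```python
# Values copied from the perf-stat.ps.* rows of the table above.
cycles = {"base": 7.958e11, "patched": 7.578e11}        # perf-stat.ps.cpu-cycles
instructions = {"base": 1.313e11, "patched": 7.564e10}  # perf-stat.ps.instructions


def cpi(kernel):
    """Cycles per instruction."""
    return cycles[kernel] / instructions[kernel]


def ipc(kernel):
    """Instructions per cycle (the inverse of CPI)."""
    return instructions[kernel] / cycles[kernel]


for kernel in ("base", "patched"):
    print(f"{kernel:8s} cpi={cpi(kernel):.2f}  ipc={ipc(kernel):.2f}")
```

The cycle count barely moves (-4.8%) while the instruction count drops by about 42%, which fits the profile: with the patched kernel around 90% of cycles are in native_queued_spin_lock_slowpath, so more time is spent spinning and less retiring instructions.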
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki