* [opencloudos:next] [sched] 44f5072e76: netperf.Throughput_Mbps 14.4% improvement
From: kernel test robot @ 2024-09-24 12:42 UTC (permalink / raw)
To: kaixuxia, frankjpliu, kasong, sagazchen, kernelxing, aurelianliu,
deshengwu, flyingpeng, jason.zeng, wu.zheng, yingbao.jia,
pei.p.jia
Cc: oe-lkp, lkp, oliver.sang
Hello,
kernel test robot noticed a 14.4% improvement of netperf.Throughput_Mbps on:
commit: 44f5072e7684629650ca645a35698d5388c23ad7 ("Revert "sched: adaptive default skew_tick value"")
https://gitee.com/OpenCloudOS/OpenCloudOS-Kernel.git next
testcase: netperf
test machine: 256 threads 2 sockets INTEL(R) XEON(R) PLATINUM 8592+ (Emerald Rapids) with 256G memory
parameters:
ip: ipv4
runtime: 300s
nr_threads: 50%
cluster: cs-localhost
send_size: 10K
test: SCTP_STREAM_MANY
cpufreq_governor: performance
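For readers who want to approximate this workload by hand, the parameters above can be turned into a netperf command line. The following is only a sketch, not the actual LKP job script: the helper names are made up for illustration, the loopback address is an assumption based on cluster: cs-localhost, and the instance count assumes that nr_threads: 50% means half of the machine's 256 hardware threads.

```python
import shlex


def netperf_argv(test, runtime_s, send_size, host="127.0.0.1"):
    """Build one netperf client command line (hypothetical helper).

    -4 forces IPv4, -H names the netserver host, -t selects the test,
    -l sets the run length in seconds, and the test-specific -m option
    (after "--") sets the send size.
    """
    return ["netperf", "-4", "-H", host, "-t", test,
            "-l", str(runtime_s), "--", "-m", send_size]


def instances(nr_cpus, percent):
    """Number of parallel clients, assuming nr_threads is a share of CPUs."""
    return nr_cpus * percent // 100


argv = netperf_argv("SCTP_STREAM_MANY", 300, "10K")
n = instances(256, 50)
print(f"{n} x {shlex.join(argv)}")
```

The resulting count of 128 instances is consistent with the results table, where netperf.Throughput_total_Mbps is roughly 128 times netperf.Throughput_Mbps.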
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240924/202409241630.7e2e7b8a-oliver.sang@intel.com
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/send_size/tbox_group/test/testcase:
cs-localhost/gcc-12/performance/ipv4/x86_64-oc_stream_base_config/50%/debian-12-x86_64-20240206.cgz/300s/10K/lkp-emr-2sp1/SCTP_STREAM_MANY/netperf
commit:
a1aa259039 ("Merge branch 'likexu/kvm/cube-optimization' into 'master' (merge request !158)")
44f5072e76 ("Revert "sched: adaptive default skew_tick value"")
a1aa2590392cbeea 44f5072e7684629650ca645a356
---------------- ---------------------------
%stddev %change %stddev
\ | \
3.096e+08 ± 2% +13.9% 3.526e+08 ± 3% cpuidle..usage
0.00 ± 28% -42.6% 0.00 ± 11% sched_debug.cpu.next_balance.stddev
52332 ± 21% +78.9% 93622 ± 20% sched_debug.cpu.nr_switches.min
17.74 ± 2% +15.3% 20.45 ± 2% vmstat.procs.r
1985288 ± 2% +14.7% 2276309 ± 3% vmstat.system.cs
53740 ± 2% +8.4% 58264 ± 3% vmstat.system.in
0.99 ± 2% +0.2 1.23 ± 2% mpstat.cpu.all.soft%
5.10 ± 2% +0.9 6.05 ± 3% mpstat.cpu.all.sys%
0.19 +0.0 0.22 mpstat.cpu.all.usr%
8.35 ± 3% +23.1% 10.28 ± 3% mpstat.max_utilization_pct
17.50 ± 21% +46.7% 25.67 ± 10% perf-c2c.DRAM.local
1456 ± 5% +124.4% 3269 ± 2% perf-c2c.DRAM.remote
9427 ± 6% +17.7% 11098 ± 6% perf-c2c.HITM.local
720.17 ± 5% +39.4% 1004 ± 6% perf-c2c.HITM.remote
10148 ± 6% +19.3% 12102 ± 5% perf-c2c.HITM.total
5468174 +12.3% 6142120 ± 5% meminfo.Cached
3087455 ± 3% +21.8% 3760849 ± 8% meminfo.Committed_AS
2141836 ± 4% +31.5% 2815792 ± 11% meminfo.Inactive
2140829 ± 4% +31.4% 2813119 ± 11% meminfo.Inactive(anon)
81575 +20.7% 98438 ± 8% meminfo.Mapped
2148304 ± 4% +31.3% 2820586 ± 11% meminfo.Shmem
4117 ± 88% +73.3% 7133 ± 50% numa-vmstat.node2.nr_mapped
562669 ± 8% +28.1% 720575 ± 14% numa-vmstat.node3.nr_file_pages
529864 ± 5% +32.2% 700439 ± 11% numa-vmstat.node3.nr_inactive_anon
9314 ± 2% +38.3% 12876 ± 16% numa-vmstat.node3.nr_mapped
529765 ± 5% +32.2% 700351 ± 11% numa-vmstat.node3.nr_shmem
529864 ± 5% +32.2% 700439 ± 11% numa-vmstat.node3.nr_zone_inactive_anon
16353 ± 87% +73.6% 28387 ± 49% numa-meminfo.node2.Mapped
2250470 ± 8% +28.0% 2880924 ± 14% numa-meminfo.node3.FilePages
2120053 ± 5% +32.1% 2800479 ± 11% numa-meminfo.node3.Inactive
2119249 ± 5% +32.1% 2800379 ± 11% numa-meminfo.node3.Inactive(anon)
35285 ± 2% +39.5% 49230 ± 15% numa-meminfo.node3.Mapped
2858241 ± 4% +25.2% 3577576 ± 9% numa-meminfo.node3.MemUsed
2118855 ± 5% +32.1% 2800027 ± 11% numa-meminfo.node3.Shmem
2420 ± 2% +14.4% 2770 ± 3% netperf.ThroughputBoth_Mbps
309868 ± 2% +14.4% 354616 ± 3% netperf.ThroughputBoth_total_Mbps
2420 ± 2% +14.4% 2770 ± 3% netperf.Throughput_Mbps
309868 ± 2% +14.4% 354616 ± 3% netperf.Throughput_total_Mbps
11228 ± 2% +27.6% 14328 ± 3% netperf.time.involuntary_context_switches
1036 +15.9% 1200 ± 3% netperf.time.percent_of_cpu_this_job_got
3068 +15.9% 3556 ± 3% netperf.time.system_time
59.04 +11.4% 65.76 ± 2% netperf.time.user_time
1.135e+09 ± 2% +14.4% 1.299e+09 ± 3% netperf.workload
1366778 +12.4% 1535834 ± 5% proc-vmstat.nr_file_pages
534938 ± 4% +31.5% 703580 ± 11% proc-vmstat.nr_inactive_anon
20750 +20.4% 24985 ± 8% proc-vmstat.nr_mapped
536809 ± 4% +31.4% 705449 ± 11% proc-vmstat.nr_shmem
534938 ± 4% +31.5% 703580 ± 11% proc-vmstat.nr_zone_inactive_anon
1.466e+09 ± 2% +14.5% 1.678e+09 ± 3% proc-vmstat.numa_hit
1.464e+09 ± 2% +14.5% 1.676e+09 ± 3% proc-vmstat.numa_local
8.422e+09 ± 2% +14.5% 9.639e+09 ± 3% proc-vmstat.pgalloc_normal
1647051 +1.8% 1677130 proc-vmstat.pgfault
8.421e+09 ± 2% +14.5% 9.638e+09 ± 3% proc-vmstat.pgfree
1.169e+10 ± 2% +14.7% 1.341e+10 ± 3% perf-stat.i.branch-instructions
0.48 -0.0 0.47 perf-stat.i.branch-miss-rate%
55170417 ± 2% +11.1% 61308068 ± 2% perf-stat.i.branch-misses
80142341 +11.7% 89535369 ± 2% perf-stat.i.cache-misses
1.68e+09 ± 2% +14.6% 1.925e+09 ± 3% perf-stat.i.cache-references
2005173 ± 2% +14.8% 2301925 ± 3% perf-stat.i.context-switches
7.149e+10 ± 2% +15.4% 8.253e+10 ± 4% perf-stat.i.cpu-cycles
474.46 +3.5% 491.30 perf-stat.i.cpu-migrations
6.504e+10 ± 2% +14.7% 7.462e+10 ± 3% perf-stat.i.instructions
7.83 ± 2% +14.8% 8.99 ± 3% perf-stat.i.metric.K/sec
5073 +2.2% 5184 perf-stat.i.minor-faults
5073 +2.2% 5184 perf-stat.i.page-faults
0.02 ± 6% -25.8% 0.02 ± 12% perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.__x64_sys_wait4
0.05 ± 4% -19.0% 0.04 ± 10% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
0.09 ± 2% +12.2% 0.10 ± 3% perf-sched.sch_delay.avg.ms.schedule_timeout.sctp_wait_for_sndbuf.sctp_sendmsg_to_asoc.sctp_sendmsg
0.11 ± 8% +3964.3% 4.51 ±216% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.30 ± 5% -18.8% 0.25 ± 13% perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
1.87 ± 5% -25.5% 1.40 ± 11% perf-sched.total_wait_and_delay.average.ms
2029785 ± 5% +32.8% 2695971 ± 11% perf-sched.total_wait_and_delay.count.ms
1.87 ± 5% -25.7% 1.39 ± 11% perf-sched.total_wait_time.average.ms
25.91 ± 63% -59.0% 10.62 ± 21% perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
189.87 ± 16% +91.0% 362.66 ± 19% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.schedule_hrtimeout_range.do_poll.constprop.0
0.30 ± 5% -24.8% 0.23 ± 11% perf-sched.wait_and_delay.avg.ms.schedule_timeout.sctp_skb_recv_datagram.sctp_recvmsg.inet_recvmsg
49.67 ± 13% +52.0% 75.50 ± 19% perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.sctp_sendmsg.inet_sendmsg
1536 -16.7% 1280 perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.wait_for_completion.affine_move_task.__set_cpus_allowed_ptr_locked
29.33 ± 13% -28.4% 21.00 ± 10% perf-sched.wait_and_delay.count.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
66.50 ± 15% -47.9% 34.67 ± 21% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.schedule_hrtimeout_range.do_poll.constprop.0
1991444 ± 5% +33.5% 2658433 ± 11% perf-sched.wait_and_delay.count.schedule_timeout.sctp_skb_recv_datagram.sctp_recvmsg.inet_recvmsg
2429 -10.2% 2181 perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
2.74 -9.3% 2.48 perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.wait_for_completion.affine_move_task.__set_cpus_allowed_ptr_locked
25.87 ± 63% -59.1% 10.58 ± 21% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
189.78 ± 16% +91.0% 362.52 ± 19% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.schedule_hrtimeout_range.do_poll.constprop.0
0.30 ± 5% -25.0% 0.22 ± 11% perf-sched.wait_time.avg.ms.schedule_timeout.sctp_skb_recv_datagram.sctp_recvmsg.inet_recvmsg
49.66 ±142% -87.0% 6.45 ± 92% perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.irqentry_exit
0.84 ± 2% +0.0 0.89 ± 2% perf-profile.calltrace.cycles-pp.__sk_mem_reclaim.sctp_wfree.skb_release_head_state.consume_skb.sctp_chunk_put
0.82 ± 3% +0.1 0.88 ± 2% perf-profile.calltrace.cycles-pp.__sk_mem_reduce_allocated.__sk_mem_reclaim.sctp_wfree.skb_release_head_state.consume_skb
1.57 +0.1 1.63 perf-profile.calltrace.cycles-pp.sctp_wfree.skb_release_head_state.consume_skb.sctp_chunk_put.sctp_chunk_free
1.65 +0.1 1.72 perf-profile.calltrace.cycles-pp.skb_release_head_state.consume_skb.sctp_chunk_put.sctp_chunk_free.sctp_outq_sack
4.22 -0.1 4.08 ± 2% perf-profile.children.cycles-pp.__schedule
2.01 -0.1 1.90 ± 3% perf-profile.children.cycles-pp.schedule_idle
0.14 ± 11% -0.1 0.05 ± 8% perf-profile.children.cycles-pp.nohz_run_idle_balance
0.27 ± 4% -0.1 0.22 ± 4% perf-profile.children.cycles-pp.tick_nohz_idle_exit
0.22 ± 9% -0.0 0.17 ± 5% perf-profile.children.cycles-pp.tick_nohz_idle_stop_tick
0.20 ± 6% -0.0 0.16 ± 6% perf-profile.children.cycles-pp.tick_nohz_stop_tick
0.78 -0.0 0.74 ± 2% perf-profile.children.cycles-pp.sctp_packet_transmit_chunk
0.18 ± 7% -0.0 0.14 ± 5% perf-profile.children.cycles-pp.quiet_vmstat
0.10 ± 5% -0.0 0.06 ± 11% perf-profile.children.cycles-pp.tick_nohz_restart_sched_tick
0.11 ± 3% -0.0 0.10 ± 5% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.05 ± 7% +0.0 0.07 ± 5% perf-profile.children.cycles-pp.tick_program_event
0.53 +0.0 0.55 ± 2% perf-profile.children.cycles-pp.drain_stock
0.18 ± 5% +0.0 0.20 ± 3% perf-profile.children.cycles-pp.update_sg_lb_stats
0.04 ± 44% +0.0 0.07 ± 7% perf-profile.children.cycles-pp.idle_cpu
0.22 ± 6% +0.0 0.25 ± 4% perf-profile.children.cycles-pp.update_sd_lb_stats
0.22 ± 7% +0.0 0.26 ± 4% perf-profile.children.cycles-pp.find_busiest_group
0.90 ± 2% +0.0 0.94 perf-profile.children.cycles-pp.refill_stock
0.26 ± 5% +0.0 0.31 ± 4% perf-profile.children.cycles-pp.load_balance
0.14 ± 3% +0.1 0.20 ± 4% perf-profile.children.cycles-pp.tick_sched_handle
0.08 +0.1 0.14 ± 5% perf-profile.children.cycles-pp.scheduler_tick
1.68 +0.1 1.74 perf-profile.children.cycles-pp.sctp_wfree
0.13 ± 4% +0.1 0.19 ± 4% perf-profile.children.cycles-pp.update_process_times
0.00 +0.1 0.06 ± 11% perf-profile.children.cycles-pp.nohz_balancer_kick
0.06 ± 9% +0.1 0.13 ± 8% perf-profile.children.cycles-pp.update_blocked_averages
0.16 ± 2% +0.1 0.24 ± 5% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.00 +0.1 0.07 ± 9% perf-profile.children.cycles-pp.trigger_load_balance
0.15 ± 3% +0.1 0.22 ± 5% perf-profile.children.cycles-pp.tick_sched_timer
0.08 ± 13% +0.1 0.16 ± 2% perf-profile.children.cycles-pp.rebalance_domains
4.48 +0.1 4.56 perf-profile.children.cycles-pp.consume_skb
1.98 ± 2% +0.1 2.06 ± 2% perf-profile.children.cycles-pp.__sk_mem_reduce_allocated
3.35 +0.1 3.45 perf-profile.children.cycles-pp.skb_release_head_state
0.21 ± 3% +0.1 0.31 ± 4% perf-profile.children.cycles-pp.hrtimer_interrupt
0.21 ± 2% +0.1 0.32 ± 4% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.19 ± 10% +0.1 0.31 ± 7% perf-profile.children.cycles-pp._nohz_idle_balance
0.52 ± 4% +0.1 0.64 ± 2% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.13 ± 11% +0.1 0.26 ± 8% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
0.56 ± 4% +0.1 0.70 ± 2% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.12 ± 11% +0.1 0.26 ± 8% perf-profile.children.cycles-pp.sysvec_call_function_single
0.36 ± 7% +0.2 0.53 ± 6% perf-profile.children.cycles-pp.irq_exit_rcu
0.36 ± 6% +0.2 0.53 ± 6% perf-profile.children.cycles-pp.__irq_exit_rcu
0.20 ± 10% +0.2 0.37 ± 6% perf-profile.children.cycles-pp.run_rebalance_domains
0.10 -0.0 0.08 perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.09 ± 5% +0.0 0.10 ± 4% perf-profile.self.cycles-pp.ipv4_dst_check
0.18 ± 4% +0.0 0.20 ± 3% perf-profile.self.cycles-pp.ktime_get
0.09 ± 5% +0.0 0.11 ± 8% perf-profile.self.cycles-pp.lock_sock_nested
0.16 ± 4% +0.0 0.18 ± 3% perf-profile.self.cycles-pp.sctp_outq_tail
0.03 ± 70% +0.0 0.06 ± 7% perf-profile.self.cycles-pp.idle_cpu
0.00 +0.1 0.05 ± 7% perf-profile.self.cycles-pp.sctp_sf_do_prm_send
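The %change column in the table above can be checked from the two mean columns. The sketch below assumes the usual convention: a relative change for ordinary metrics, and an absolute difference in percentage points for metrics that are already percentages (the rows printed as "+0.9" rather than "+N%"). The function names are illustrative and not part of lkp-tests.

```python
def pct_change(base, new):
    """Relative change of new versus base, in percent."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference, for metrics that are already percentages."""
    return new - base


# netperf.Throughput_total_Mbps: 309868 -> 354616
total = pct_change(309868, 354616)
print(f"netperf.Throughput_total_Mbps {total:+.1f}%")

# mpstat.cpu.all.sys%: 5.10 -> 6.05, reported in percentage points
sys_delta = point_change(5.10, 6.05)
print(f"mpstat.cpu.all.sys% {sys_delta:+.2f} points")
```

Recomputing from the displayed values will not always match the report to the last digit. For example, the per-instance means 2420 and 2770 give about 14.5% rather than 14.4%, presumably because the displayed means are rounded while the robot works from unrounded per-run data.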
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki