From: kernel test robot <oliver.sang@intel.com>
To: Eric Dumazet <edumazet@google.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-kernel@vger.kernel.org>, Jakub Kicinski <kuba@kernel.org>,
Jason Xing <kerneljasonxing@gmail.com>,
Kuniyuki Iwashima <kuniyu@amazon.com>, <netdev@vger.kernel.org>,
<oliver.sang@intel.com>
Subject: [linus:master] [tcp] 86c2bc293b: stress-ng.sockmany.ops_per_sec 6.8% improvement
Date: Tue, 10 Jun 2025 21:57:53 +0800
Message-ID: <202506102156.1d2bde14-lkp@intel.com>
Hello,
kernel test robot noticed a 6.8% improvement of stress-ng.sockmany.ops_per_sec on:
commit: 86c2bc293b8130aec9fa504e953531a84a6eb9a6 ("tcp: use RCU lookup in __inet_hash_connect()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: sockmany
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250610/202506102156.1d2bde14-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/sockmany/stress-ng/60s
commit:
d186f405fd ("tcp: add RCU management to inet_bind_bucket")
86c2bc293b ("tcp: use RCU lookup in __inet_hash_connect()")
d186f405fdf4229d 86c2bc293b8130aec9fa504e953
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
0.62 ± 3% +0.1 0.69 ± 2% mpstat.cpu.all.irq%
521879 -1.5% 514052 vmstat.system.in
4059292 +6.8% 4335271 stress-ng.sockmany.ops
67315 +6.8% 71863 stress-ng.sockmany.ops_per_sec
903062 +4.0% 939576 proc-vmstat.nr_slab_reclaimable
5715333 +5.7% 6043532 proc-vmstat.pgfree
30955 ± 4% -5.6% 29223 ± 3% proc-vmstat.pgreuse
617802 +12.5% 694736 ± 2% perf-c2c.DRAM.local
43535 ± 2% -55.2% 19524 ± 2% perf-c2c.HITM.local
13760 ± 4% -94.7% 726.83 ± 9% perf-c2c.HITM.remote
57296 ± 3% -64.7% 20251 ± 2% perf-c2c.HITM.total
4862651 ± 23% +26.2% 6137833 ± 6% sched_debug.cfs_rq:/.avg_vruntime.min
0.24 ± 6% +23.8% 0.30 ± 5% sched_debug.cfs_rq:/.h_nr_queued.stddev
4862651 ± 23% +26.2% 6137833 ± 6% sched_debug.cfs_rq:/.min_vruntime.min
0.24 ± 6% +23.3% 0.30 ± 6% sched_debug.cpu.nr_running.stddev
40590 ± 3% +18.8% 48233 ± 17% sched_debug.cpu.nr_switches.max
0.63 ± 12% +20.6% 0.76 ± 7% perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
0.32 ± 10% -41.2% 0.19 ± 18% perf-sched.sch_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
0.19 ±195% +772.8% 1.62 ± 82% perf-sched.sch_delay.max.ms.__cond_resched.task_work_run.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.30 ± 31% +51.8% 3.49 ± 12% perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
20.10 -23.3% 15.41 perf-sched.total_wait_and_delay.average.ms
177307 +32.5% 234941 perf-sched.total_wait_and_delay.count.ms
20.04 -23.4% 15.36 perf-sched.total_wait_time.average.ms
125.96 ±110% -73.3% 33.69 ± 17% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
13.68 -25.7% 10.16 perf-sched.wait_and_delay.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
0.65 ± 10% -41.0% 0.38 ± 18% perf-sched.wait_and_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
79042 +32.2% 104463 perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
81037 +34.4% 108937 perf-sched.wait_and_delay.count.schedule_timeout.inet_csk_accept.inet_accept.do_accept
1965 ± 9% +125.3% 4427 ± 3% perf-sched.wait_and_delay.count.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
2427 ± 3% +12.5% 2729 ± 2% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
13.36 ± 2% -25.0% 10.02 perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
13.66 -25.7% 10.15 perf-sched.wait_time.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
0.33 ± 10% -40.8% 0.19 ± 18% perf-sched.wait_time.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
35.56 +15.4% 41.03 perf-stat.i.MPKI
1.386e+10 +3.1% 1.428e+10 perf-stat.i.branch-instructions
2.15 +0.1 2.26 perf-stat.i.branch-miss-rate%
2.923e+08 +8.8% 3.182e+08 perf-stat.i.branch-misses
71.48 +5.8 77.26 perf-stat.i.cache-miss-rate%
2.391e+09 +24.9% 2.985e+09 perf-stat.i.cache-misses
3.296e+09 +15.3% 3.802e+09 perf-stat.i.cache-references
9.36 -7.4% 8.66 perf-stat.i.cpi
291.67 -17.3% 241.22 perf-stat.i.cycles-between-cache-misses
7.053e+10 +8.2% 7.631e+10 perf-stat.i.instructions
0.12 +7.1% 0.13 perf-stat.i.ipc
34.03 +14.9% 39.11 perf-stat.overall.MPKI
2.11 +0.1 2.23 perf-stat.overall.branch-miss-rate%
72.58 +5.9 78.51 perf-stat.overall.cache-miss-rate%
9.04 -7.8% 8.34 perf-stat.overall.cpi
265.78 -19.8% 213.18 perf-stat.overall.cycles-between-cache-misses
0.11 +8.5% 0.12 perf-stat.overall.ipc
1.359e+10 +3.4% 1.405e+10 perf-stat.ps.branch-instructions
2.863e+08 +9.3% 3.129e+08 perf-stat.ps.branch-misses
2.353e+09 +24.7% 2.935e+09 perf-stat.ps.cache-misses
3.242e+09 +15.3% 3.739e+09 perf-stat.ps.cache-references
6.915e+10 +8.5% 7.506e+10 perf-stat.ps.instructions
4.246e+12 +8.2% 4.596e+12 perf-stat.total.instructions
66.41 ± 70% -49.8 16.57 ±223% perf-profile.calltrace.cycles-pp.stress_sockmany
66.32 ± 70% -49.8 16.54 ±223% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.connect.stress_sockmany
66.32 ± 70% -49.8 16.54 ±223% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.connect.stress_sockmany
66.32 ± 70% -49.8 16.54 ±223% perf-profile.calltrace.cycles-pp.connect.stress_sockmany
66.31 ± 70% -49.8 16.54 ±223% perf-profile.calltrace.cycles-pp.__sys_connect.__x64_sys_connect.do_syscall_64.entry_SYSCALL_64_after_hwframe.connect
66.31 ± 70% -49.8 16.54 ±223% perf-profile.calltrace.cycles-pp.__x64_sys_connect.do_syscall_64.entry_SYSCALL_64_after_hwframe.connect.stress_sockmany
66.31 ± 70% -49.8 16.54 ±223% perf-profile.calltrace.cycles-pp.__inet_stream_connect.inet_stream_connect.__sys_connect.__x64_sys_connect.do_syscall_64
66.31 ± 70% -49.8 16.54 ±223% perf-profile.calltrace.cycles-pp.inet_stream_connect.__sys_connect.__x64_sys_connect.do_syscall_64.entry_SYSCALL_64_after_hwframe
66.25 ± 70% -49.7 16.52 ±223% perf-profile.calltrace.cycles-pp.tcp_v4_connect.__inet_stream_connect.inet_stream_connect.__sys_connect.__x64_sys_connect
66.09 ± 70% -49.6 16.48 ±223% perf-profile.calltrace.cycles-pp.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect.__sys_connect
54.17 ± 70% -38.3 15.86 ±223% perf-profile.calltrace.cycles-pp.__inet_check_established.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
10.32 ± 70% -10.3 0.00 perf-profile.calltrace.cycles-pp._raw_spin_lock_bh.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
4.67 ± 70% -4.7 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_bh.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect
66.53 ± 70% -49.9 16.60 ±223% perf-profile.children.cycles-pp.do_syscall_64
66.53 ± 70% -49.9 16.60 ±223% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
66.41 ± 70% -49.8 16.57 ±223% perf-profile.children.cycles-pp.stress_sockmany
66.33 ± 70% -49.8 16.54 ±223% perf-profile.children.cycles-pp.connect
66.31 ± 70% -49.8 16.54 ±223% perf-profile.children.cycles-pp.__inet_stream_connect
66.31 ± 70% -49.8 16.54 ±223% perf-profile.children.cycles-pp.__sys_connect
66.31 ± 70% -49.8 16.54 ±223% perf-profile.children.cycles-pp.__x64_sys_connect
66.31 ± 70% -49.8 16.54 ±223% perf-profile.children.cycles-pp.inet_stream_connect
66.25 ± 70% -49.7 16.52 ±223% perf-profile.children.cycles-pp.tcp_v4_connect
66.21 ± 70% -49.7 16.50 ±223% perf-profile.children.cycles-pp.__inet_hash_connect
54.25 ± 70% -38.4 15.89 ±223% perf-profile.children.cycles-pp.__inet_check_established
10.37 ± 70% -10.4 0.00 perf-profile.children.cycles-pp._raw_spin_lock_bh
4.67 ± 70% -4.7 0.00 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
53.42 ± 70% -37.8 15.58 ±223% perf-profile.self.cycles-pp.__inet_check_established
5.65 ± 70% -5.6 0.00 perf-profile.self.cycles-pp._raw_spin_lock_bh
4.62 ± 70% -4.6 0.00 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
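The %change column above can be cross-checked by hand. The sketch below is not part of lkp-tests; the helper names `pct_change` and `point_diff` are made up for illustration. It assumes %change is (patched - parent) / parent for ordinary counters, and that for metrics already expressed in percent (names ending in "%", where the column carries no "%" sign) the column is the plain difference in percentage points. The input values are copied from the table.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the parent commit to the patched commit, in percent."""
    return (new - base) / base * 100.0


def point_diff(base: float, new: float) -> float:
    """Absolute difference, assumed to apply to metrics already given in percent."""
    return new - base


# (metric, value on d186f405fd, value on 86c2bc293b), copied from the table
relative_rows = [
    ("stress-ng.sockmany.ops", 4059292, 4335271),
    ("stress-ng.sockmany.ops_per_sec", 67315, 71863),
    ("perf-c2c.HITM.local", 43535, 19524),
    ("perf-c2c.HITM.remote", 13760, 726.83),
    ("perf-c2c.HITM.total", 57296, 20251),
]

# Metrics that are themselves percentages (assumption: shown as point differences)
point_rows = [
    ("mpstat.cpu.all.irq%", 0.62, 0.69),
    ("perf-stat.i.cache-miss-rate%", 71.48, 77.26),
]

for name, base, new in relative_rows:
    print(f"{pct_change(base, new):+7.1f}%  {name}")

for name, base, new in point_rows:
    print(f"{point_diff(base, new):+7.1f}   {name}")
```

Rounded to one decimal place, these reproduce the figures reported in the table for the rows listed, including the 6.8% improvement in the subject line.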
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki