From: kernel test robot <oliver.sang@intel.com>
To: Eric Dumazet <edumazet@google.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<netdev@vger.kernel.org>,
	"David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Neal Cardwell <ncardwell@google.com>,
	Kuniyuki Iwashima <kuniyu@amazon.com>,
	Jason Xing <kerneljasonxing@gmail.com>,
	Simon Horman <horms@kernel.org>, <eric.dumazet@gmail.com>,
	Eric Dumazet <edumazet@google.com>, <oliver.sang@intel.com>
Subject: Re: [PATCH net-next 4/4] tcp: use RCU lookup in __inet_hash_connect()
Date: Mon, 10 Mar 2025 22:03:06 +0800
Message-ID: <202503102159.5f78c207-lkp@intel.com>
In-Reply-To: <20250302124237.3913746-5-edumazet@google.com>



Hello,

kernel test robot noticed a 6.9% improvement of stress-ng.sockmany.ops_per_sec on:


commit: ba6c94b99d772f431fd589dd2cd606b59063557b ("[PATCH net-next 4/4] tcp: use RCU lookup in __inet_hash_connect()")
url: https://github.com/intel-lab-lkp/linux/commits/Eric-Dumazet/tcp-use-RCU-in-__inet-6-_check_established/20250302-204711
base: https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git f77f12010f67259bd0e1ad18877ed27c721b627a
patch link: https://lore.kernel.org/all/20250302124237.3913746-5-edumazet@google.com/
patch subject: [PATCH net-next 4/4] tcp: use RCU lookup in __inet_hash_connect()

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sockmany
	cpufreq_governor: performance
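
As a quick sanity check on the headline figure, the 6.9% improvement can be
reproduced from the two stress-ng.sockmany.ops_per_sec means reported in the
comparison table further down (67264 for the parent commit, 71933 for the
patched one). The following is a minimal Python sketch; `pct_change` is a
hypothetical helper name used for illustration and is not part of lkp-tests.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change of the patched mean versus the base mean, in percent."""
    return (patched - base) / base * 100.0


# Means copied from the comparison table in this report (base, patched).
ops_per_sec = pct_change(67264, 71933)
ops = pct_change(4057584, 4340521)

print(f"stress-ng.sockmany.ops_per_sec: {ops_per_sec:+.1f}%")
print(f"stress-ng.sockmany.ops:         {ops:+.1f}%")
```

Rounded to one decimal place, the two values come out at +6.9% and +7.0%,
matching the %change column for those rows.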






Details are below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250310/202503102159.5f78c207-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/sockmany/stress-ng/60s

commit: 
  4f97f75a5b ("tcp: add RCU management to inet_bind_bucket")
  ba6c94b99d ("tcp: use RCU lookup in __inet_hash_connect()")

4f97f75a5bfa79ba ba6c94b99d772f431fd589dd2cd 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   1742139 ± 89%     -91.6%     146373 ± 56%  numa-meminfo.node1.Unevictable
      0.61 ±  3%      +0.1        0.71 ±  3%  mpstat.cpu.all.irq%
      0.42            +0.0        0.46 ±  2%  mpstat.cpu.all.usr%
    435534 ± 89%     -91.6%      36593 ± 56%  numa-vmstat.node1.nr_unevictable
    435534 ± 89%     -91.6%      36593 ± 56%  numa-vmstat.node1.nr_zone_unevictable
   4057584            +7.0%    4340521        stress-ng.sockmany.ops
     67264            +6.9%      71933        stress-ng.sockmany.ops_per_sec
    604900           +12.3%     679404 ±  4%  perf-c2c.DRAM.local
     42998 ±  2%     -55.7%      19034 ±  3%  perf-c2c.HITM.local
     13764 ±  4%     -95.2%     663.67 ± 13%  perf-c2c.HITM.remote
     56762 ±  2%     -65.3%      19698 ±  4%  perf-c2c.HITM.total
   7422009           +13.2%    8403980 ±  2%  sched_debug.cfs_rq:/.avg_vruntime.max
    195564 ±  5%     +62.7%     318178 ± 10%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      0.23 ±  7%     +25.4%       0.29 ±  4%  sched_debug.cfs_rq:/.h_nr_queued.stddev
     39935 ±  4%     +27.0%      50726 ± 29%  sched_debug.cfs_rq:/.load_avg.max
   7422009           +13.2%    8403980 ±  2%  sched_debug.cfs_rq:/.min_vruntime.max
    195564 ±  5%     +62.7%     318178 ± 10%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.23 ±  6%     +26.6%       0.29 ±  4%  sched_debug.cpu.nr_running.stddev
    387640            +5.9%     410501 ±  9%  proc-vmstat.nr_active_anon
    109911 ±  2%      +8.5%     119206 ±  2%  proc-vmstat.nr_mapped
    200627            +1.9%     204454        proc-vmstat.nr_shmem
    895041            +4.9%     939289        proc-vmstat.nr_slab_reclaimable
   2982921            +5.0%    3131084        proc-vmstat.nr_slab_unreclaimable
    387640            +5.9%     410501 ±  9%  proc-vmstat.nr_zone_active_anon
   2071760            +2.0%    2112591        proc-vmstat.numa_hit
   1839824            +2.2%    1880606        proc-vmstat.numa_local
   5905025            +5.2%    6210697        proc-vmstat.pgalloc_normal
   5291411 ± 12%     +11.9%    5921072        proc-vmstat.pgfree
      0.82 ± 13%     -29.0%       0.58 ±  6%  perf-sched.sch_delay.avg.ms.__cond_resched.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
      4.50 ± 16%     +29.5%       5.83 ± 15%  perf-sched.sch_delay.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      0.03 ± 56%     -88.8%       0.00 ±223%  perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.07 ±125%   +3754.0%       2.67 ± 71%  perf-sched.sch_delay.max.ms.__cond_resched.ww_mutex_lock.drm_gem_vunmap_unlocked.drm_gem_fb_vunmap.drm_atomic_helper_commit_planes
     19.83           -22.3%      15.41        perf-sched.total_wait_and_delay.average.ms
    177991           +32.7%     236147        perf-sched.total_wait_and_delay.count.ms
     19.76           -22.3%      15.35        perf-sched.total_wait_time.average.ms
      1.64 ± 12%     -28.9%       1.17 ±  6%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
     13.69           -26.2%      10.10        perf-sched.wait_and_delay.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
      6844           +11.8%       7651 ±  3%  perf-sched.wait_and_delay.count.__cond_resched.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
     78701           +33.6%     105168        perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
     81026           +35.2%     109539        perf-sched.wait_and_delay.count.schedule_timeout.inet_csk_accept.inet_accept.do_accept
      2268 ± 14%     +90.6%       4325 ±  6%  perf-sched.wait_and_delay.count.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
      0.82 ± 12%     -28.6%       0.59 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
     13.49           -26.5%       9.91        perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
      3.05 ±  3%     +16.5%       3.55 ±  3%  perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
     30.10 ± 20%     -64.4%      10.72 ±113%  perf-sched.wait_time.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      1.14 ±  9%     +22.2%       1.40 ±  7%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
     13.67           -26.3%      10.08        perf-sched.wait_time.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
      7.36 ± 57%    +103.9%      15.01 ± 27%  perf-sched.wait_time.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.03 ± 56%     -88.8%       0.00 ±223%  perf-sched.wait_time.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.07 ±125%    +4e+05%     275.31 ±115%  perf-sched.wait_time.max.ms.__cond_resched.ww_mutex_lock.drm_gem_vunmap_unlocked.drm_gem_fb_vunmap.drm_atomic_helper_commit_planes
     35.70           +15.3%      41.18        perf-stat.i.MPKI
 1.368e+10            +4.6%  1.431e+10        perf-stat.i.branch-instructions
      2.15            +0.1        2.27        perf-stat.i.branch-miss-rate%
 2.884e+08           +10.7%  3.192e+08        perf-stat.i.branch-misses
     71.62            +5.5       77.09        perf-stat.i.cache-miss-rate%
 2.377e+09           +26.3%  3.003e+09        perf-stat.i.cache-misses
 3.264e+09           +17.4%  3.832e+09        perf-stat.i.cache-references
      9.40            -8.1%       8.64        perf-stat.i.cpi
    292.27           -18.0%     239.70        perf-stat.i.cycles-between-cache-misses
 6.963e+10            +9.8%  7.645e+10        perf-stat.i.instructions
      0.12 ±  2%      +7.3%       0.13        perf-stat.i.ipc
     34.12           +15.0%      39.25        perf-stat.overall.MPKI
      2.11            +0.1        2.23        perf-stat.overall.branch-miss-rate%
     72.81            +5.5       78.36        perf-stat.overall.cache-miss-rate%
      9.07            -8.4%       8.31        perf-stat.overall.cpi
    265.92           -20.4%     211.72        perf-stat.overall.cycles-between-cache-misses
      0.11            +9.2%       0.12        perf-stat.overall.ipc
 1.345e+10            +4.6%  1.408e+10        perf-stat.ps.branch-instructions
 2.835e+08           +10.7%  3.139e+08        perf-stat.ps.branch-misses
 2.337e+09           +26.3%  2.952e+09        perf-stat.ps.cache-misses
 3.209e+09           +17.4%  3.768e+09        perf-stat.ps.cache-references
 6.849e+10            +9.8%  7.521e+10        perf-stat.ps.instructions
 4.236e+12            +9.1%  4.621e+12        perf-stat.total.instructions
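
Several rows in the table above appear to be derived from others. The sketch
below cross-checks three of them from the reported means. The formulas are
assumptions based on the usual definitions (MPKI as cache misses per 1000
instructions, cache-miss-rate as misses over references, HITM.total as local
plus remote); they have not been checked against the lkp-tests sources, and
the inputs are already rounded, so small differences in the last digit are
expected.

```python
# (base, patched) means copied from the comparison table above.
cache_misses = (2.337e9, 2.952e9)     # perf-stat.ps.cache-misses
cache_refs = (3.209e9, 3.768e9)       # perf-stat.ps.cache-references
instructions = (6.849e10, 7.521e10)   # perf-stat.ps.instructions
hitm_local = (42998, 19034)           # perf-c2c.HITM.local
hitm_remote = (13764, 663.67)         # perf-c2c.HITM.remote


def mpki(misses: float, instr: float) -> float:
    """Cache misses per 1000 instructions (assumed definition)."""
    return misses / instr * 1000.0


def miss_rate(misses: float, refs: float) -> float:
    """Cache misses as a percentage of cache references (assumed definition)."""
    return misses / refs * 100.0


results = {}
for idx, label in enumerate(("base", "patched")):
    results[label] = {
        "mpki": mpki(cache_misses[idx], instructions[idx]),
        "miss_rate": miss_rate(cache_misses[idx], cache_refs[idx]),
        "hitm_total": hitm_local[idx] + hitm_remote[idx],
    }
    r = results[label]
    print(f"{label:8s} MPKI={r['mpki']:.2f} "
          f"miss-rate={r['miss_rate']:.2f}% HITM.total={r['hitm_total']:.0f}")
```

Under these assumptions the computed MPKI and HITM.total agree with the
perf-stat.overall.MPKI and perf-c2c.HITM.total rows, and the miss rate lands
within a few hundredths of a percentage point of
perf-stat.overall.cache-miss-rate%.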




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

