From: kernel test robot <oliver.sang@intel.com>
To: Eric Dumazet <edumazet@google.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>, Jakub Kicinski <kuba@kernel.org>,
	Kuniyuki Iwashima <kuniyu@google.com>, <netdev@vger.kernel.org>,
	<oliver.sang@intel.com>
Subject: [linus:master] [tcp]  1d2fbaad7c:  stress-ng.sigurg.ops_per_sec 12.2% regression
Date: Mon, 18 Aug 2025 12:48:06 +0800
Message-ID: <202508180406.dbf438fc-lkp@intel.com>


Hello,

kernel test robot noticed a 12.2% regression of stress-ng.sigurg.ops_per_sec on:


commit: 1d2fbaad7cd8cc96899179f9898ad2787a15f0a0 ("tcp: stronger sk_rcvbuf checks")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on      linus/master d7ee5bdce7892643409dea7266c34977e651b479]
[still regression on linux-next/master 1357b2649c026b51353c84ddd32bc963e8999603]
[still regression on        fix commit 972ca7a3bc9a136b15ba698713b056a4900e2634]

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sigurg
	cpufreq_governor: performance
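
For readers unfamiliar with the test, the sketch below shows the kind of traffic that would make a TCP receive-path change show up in a signal benchmark. It is not stress-ng's code. It assumes the sigurg stressor follows the usual pattern of sending TCP out-of-band (urgent) data to a socket whose owner is notified with SIGURG. The helper name `oob_round_trip` and the timeout are hypothetical.

```python
import fcntl
import os
import signal
import socket
import time


def oob_round_trip(timeout_s=2.0):
    """Send one urgent byte over loopback TCP and wait for SIGURG.

    Returns (signal_seen, urgent_byte). Illustrative only.
    """
    hits = []

    def on_sigurg(signum, frame):
        hits.append(signum)

    previous = signal.signal(signal.SIGURG, on_sigurg)
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender = receiver = None
    try:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        sender = socket.create_connection(listener.getsockname())
        receiver, _ = listener.accept()

        # Ask the kernel to deliver SIGURG for this socket to our process.
        fcntl.fcntl(receiver.fileno(), fcntl.F_SETOWN, os.getpid())

        # One byte of out-of-band data; the receive side raises SIGURG.
        sender.send(b"!", socket.MSG_OOB)

        deadline = time.monotonic() + timeout_s
        while not hits and time.monotonic() < deadline:
            time.sleep(0.01)

        if not hits:
            return False, b""
        # Without SO_OOBINLINE the urgent byte is read with MSG_OOB.
        return True, receiver.recv(1, socket.MSG_OOB)
    finally:
        for sock in (sender, receiver, listener):
            if sock is not None:
                sock.close()
        signal.signal(signal.SIGURG,
                      previous if previous is not None else signal.SIG_DFL)


if __name__ == "__main__":
    print(oob_round_trip())
```

Every iteration of such a loop goes through the TCP send and receive paths, which is consistent with the tcp_sendmsg and tcp_recvmsg call chains in the perf-sched rows further down.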


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202508180406.dbf438fc-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250818/202508180406.dbf438fc-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/sigurg/stress-ng/60s

commit: 
  75dff0584c ("tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb")
  1d2fbaad7c ("tcp: stronger sk_rcvbuf checks")

75dff0584cce7920 1d2fbaad7cd8cc96899179f9898 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     36434            +7.6%      39205        vmstat.system.cs
   5683321           -13.3%    4926200 ±  2%  vmstat.system.in
    530991 ±  2%      -9.3%     481619 ±  3%  meminfo.Mapped
   1132865           -13.5%     979753        meminfo.SUnreclaim
   1292406           -11.9%    1138096        meminfo.Slab
      0.62 ±  2%      +0.1        0.70        mpstat.cpu.all.irq%
     24.14            -8.3       15.83 ±  2%  mpstat.cpu.all.soft%
     10.95            +2.3       13.22        mpstat.cpu.all.usr%
    627541 ±  4%     -15.4%     530831 ±  5%  numa-meminfo.node0.SUnreclaim
    721419 ±  3%     -14.0%     620592 ±  8%  numa-meminfo.node0.Slab
    513808 ±  6%     -13.1%     446297 ±  4%  numa-meminfo.node1.SUnreclaim
   6100681           -23.2%    4686698 ±  2%  numa-numastat.node0.local_node
   6205260           -22.6%    4802561 ±  2%  numa-numastat.node0.numa_hit
   5548582           -18.0%    4547552        numa-numastat.node1.local_node
   5676020           -17.8%    4663456        numa-numastat.node1.numa_hit
     22382 ±  2%     -37.0%      14107 ±  4%  perf-c2c.DRAM.local
     28565 ± 14%     -28.5%      20433 ± 19%  perf-c2c.DRAM.remote
     61612 ±  4%     -28.7%      43958 ± 10%  perf-c2c.HITM.local
     18329 ± 14%     -27.0%      13378 ± 19%  perf-c2c.HITM.remote
     79941           -28.3%      57336 ±  6%  perf-c2c.HITM.total
    155304 ±  4%     -14.4%     132870 ±  5%  numa-vmstat.node0.nr_slab_unreclaimable
   6217921           -22.8%    4801413 ±  2%  numa-vmstat.node0.numa_hit
   6113343           -23.4%    4685551 ±  2%  numa-vmstat.node0.numa_local
    127106 ±  6%     -12.0%     111885 ±  4%  numa-vmstat.node1.nr_slab_unreclaimable
   5686635           -18.0%    4662431        numa-vmstat.node1.numa_hit
   5559197           -18.2%    4546532        numa-vmstat.node1.numa_local
  3.39e+08           -12.2%  2.977e+08        stress-ng.sigurg.ops
   5652273           -12.2%    4963242        stress-ng.sigurg.ops_per_sec
   1885719           +11.0%    2092671        stress-ng.time.involuntary_context_switches
     16523           +11.2%      18365        stress-ng.time.percent_of_cpu_this_job_got
      8500            +9.2%       9278        stress-ng.time.system_time
      1438           +23.0%       1769        stress-ng.time.user_time
    195971            -6.0%     184305        stress-ng.time.voluntary_context_switches
    487113 ±  7%      -5.8%     459038        proc-vmstat.nr_active_anon
    134039            -9.5%     121269 ±  4%  proc-vmstat.nr_mapped
    186858 ± 20%     -15.3%     158269 ±  2%  proc-vmstat.nr_shmem
    284955 ±  2%     -13.8%     245616        proc-vmstat.nr_slab_unreclaimable
    487113 ±  7%      -5.8%     459038        proc-vmstat.nr_zone_active_anon
  11891822           -20.5%    9456122        proc-vmstat.numa_hit
  11659806           -20.9%    9224357        proc-vmstat.numa_local
  86214365           -22.0%   67254297        proc-vmstat.pgalloc_normal
  85564410           -21.8%   66883184        proc-vmstat.pgfree
   6156738           +13.9%    7012286        sched_debug.cfs_rq:/.avg_vruntime.avg
   7693151           +10.1%    8468818        sched_debug.cfs_rq:/.avg_vruntime.max
   4636464 ±  5%     +14.2%    5295369 ±  4%  sched_debug.cfs_rq:/.avg_vruntime.min
    238.39 ± 92%    +228.2%     782.32 ± 46%  sched_debug.cfs_rq:/.load_avg.avg
   6156739           +13.9%    7012287        sched_debug.cfs_rq:/.min_vruntime.avg
   7693151           +10.1%    8468818        sched_debug.cfs_rq:/.min_vruntime.max
   4636464 ±  5%     +14.2%    5295369 ±  4%  sched_debug.cfs_rq:/.min_vruntime.min
      2580 ±  3%     -13.3%       2236 ±  8%  sched_debug.cfs_rq:/.runnable_avg.max
    104496 ± 28%     -64.4%      37246 ± 38%  sched_debug.cpu.avg_idle.min
      1405 ±  3%     +12.7%       1583 ±  2%  sched_debug.cpu.nr_switches.stddev
      0.68 ±  3%     -40.9%       0.40 ±  3%  perf-stat.i.MPKI
 9.475e+10           +26.6%  1.199e+11        perf-stat.i.branch-instructions
      0.13 ±  5%      -0.0        0.09 ±  2%  perf-stat.i.branch-miss-rate%
 1.178e+08 ±  3%     -14.9%  1.003e+08        perf-stat.i.branch-misses
     40.25            -3.2       37.02        perf-stat.i.cache-miss-rate%
 3.325e+08 ±  2%     -25.9%  2.465e+08 ±  3%  perf-stat.i.cache-misses
 8.258e+08           -19.2%  6.672e+08        perf-stat.i.cache-references
     37598            +8.1%      40642        perf-stat.i.context-switches
      1.31           -21.4%       1.03        perf-stat.i.cpi
      2327           -15.3%       1970 ±  2%  perf-stat.i.cpu-migrations
      1927 ±  2%     +33.7%       2577 ±  3%  perf-stat.i.cycles-between-cache-misses
 4.888e+11           +26.3%  6.174e+11        perf-stat.i.instructions
      0.77           +26.9%       0.98        perf-stat.i.ipc
      0.68 ±  3%     -41.3%       0.40 ±  3%  perf-stat.overall.MPKI
      0.12 ±  4%      -0.0        0.08 ±  2%  perf-stat.overall.branch-miss-rate%
     40.27            -3.3       36.95        perf-stat.overall.cache-miss-rate%
      1.31           -21.5%       1.03        perf-stat.overall.cpi
      1928 ±  2%     +33.8%       2581 ±  3%  perf-stat.overall.cycles-between-cache-misses
      0.76           +27.4%       0.97        perf-stat.overall.ipc
 9.264e+10           +26.7%  1.173e+11        perf-stat.ps.branch-instructions
 1.148e+08 ±  3%     -14.8%   97834009        perf-stat.ps.branch-misses
 3.253e+08 ±  2%     -25.8%  2.413e+08 ±  3%  perf-stat.ps.cache-misses
 8.077e+08           -19.2%   6.53e+08        perf-stat.ps.cache-references
     36742            +8.2%      39741        perf-stat.ps.context-switches
      2273           -15.4%       1922 ±  2%  perf-stat.ps.cpu-migrations
 4.779e+11           +26.4%  6.041e+11        perf-stat.ps.instructions
 2.914e+13           +27.4%  3.711e+13        perf-stat.total.instructions
      4.41 ±  4%     -17.7%       3.63 ±  6%  perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
      6.74           -12.8%       5.87 ±  3%  perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
      5.30           -19.6%       4.26 ±  4%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
      5.22           -23.4%       4.00 ±  2%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.kmalloc_reserve.__alloc_skb.tcp_stream_alloc_skb
      4.83 ±  4%     -15.7%       4.08 ±  2%  perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
      5.20 ±  2%     -17.9%       4.27        perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
      4.92 ±  2%     -16.5%       4.11 ±  2%  perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      5.75 ± 18%     -88.1%       0.69 ±115%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_common_interrupt.[unknown]
      5.00 ±  3%     -16.2%       4.19 ±  2%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.35 ± 15%     -37.1%       0.22 ± 12%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      3.47           -16.7%       2.89 ±  3%  perf-sched.sch_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
      0.13 ±  7%     -14.8%       0.11 ±  8%  perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     33.82 ± 55%     -45.0%      18.60 ± 16%  perf-sched.sch_delay.max.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
     36.83 ± 10%     -32.7%      24.80 ± 10%  perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
     10.05 ± 49%     -58.0%       4.22 ± 33%  perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
     55.68 ±  9%     -16.5%      46.49 ± 17%  perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      6.12 ± 18%     -81.2%       1.15 ±116%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_common_interrupt.[unknown]
      7.91 ± 27%     -39.8%       4.77 ± 24%  perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      1.73 ±104%     -99.1%       0.02 ± 44%  perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      4.56 ±  2%     -15.3%       3.86 ±  2%  perf-sched.total_sch_delay.average.ms
     26.64 ±  2%      -9.6%      24.08        perf-sched.total_wait_and_delay.average.ms
     22.08 ±  2%      -8.5%      20.21 ±  2%  perf-sched.total_wait_time.average.ms
     13.50           -12.5%      11.82 ±  3%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
     15.61          -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
      9.72 ±  4%     -15.7%       8.20        perf-sched.wait_and_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
     15.17 ±  6%     -21.9%      11.85 ±  3%  perf-sched.wait_and_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
     11.74 ±  2%     -12.5%      10.27 ±  2%  perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
     10.39 ±  4%     -13.0%       9.04 ±  2%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      7.06           -15.7%       5.95 ±  3%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
      2317           -49.6%       1169 ±  9%  perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
      1488 ±  5%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
      1953 ±  8%    +347.2%       8733 ±  4%  perf-sched.wait_and_delay.count.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
      2360 ±  5%    +251.5%       8296 ±  6%  perf-sched.wait_and_delay.count.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
     22781 ±  3%     +16.7%      26578 ±  3%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
     13753 ±  4%     -14.8%      11717 ±  3%  perf-sched.wait_and_delay.count.schedule_timeout.wait_woken.sk_stream_wait_memory.tcp_sendmsg_locked
      6038 ±  2%     -12.6%       5275 ±  7%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     71.60 ±  8%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
     53.03 ±  7%    +140.7%     127.64 ± 45%  perf-sched.wait_and_delay.max.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
    435.94 ±122%    +263.3%       1583 ± 27%  perf-sched.wait_and_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
    987.24 ± 22%     +59.8%       1577 ±  6%  perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      4.49 ±  4%     -16.2%       3.76 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
      6.77           -12.1%       5.95 ±  4%  perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
      4.89 ±  4%     -15.7%       4.12        perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
      9.97 ±  9%     -24.0%       7.58 ±  5%  perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
      6.82 ±  2%      -9.6%       6.17        perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      5.75 ± 18%     -87.3%       0.73 ±113%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_common_interrupt.[unknown]
      2.01 ± 14%     +38.1%       2.78 ±  7%  perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      2.38 ±  7%     -12.0%       2.09 ±  8%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      5.50 ±  3%     +24.4%       6.84 ±  7%  perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      3.59           -14.7%       3.06 ±  3%  perf-sched.wait_time.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
     26.85 ±  6%    +311.9%     110.60 ± 63%  perf-sched.wait_time.max.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
      6.12 ± 18%     -81.2%       1.15 ±116%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_common_interrupt.[unknown]
    985.54 ± 22%     +59.9%       1576 ±  6%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      2411 ± 57%     -74.1%     623.65 ± 38%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
     17.94 ± 19%     -38.0%      11.12 ± 19%  perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    249.22 ± 28%     +48.9%     371.19 ± 16%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
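
The %change column can be re-derived from the two value columns. The sketch below assumes what the numbers in the table suggest: ordinary metrics show the relative change against the parent commit, while metrics whose name ends in "%" show the absolute difference in percentage points. The function name `change` is hypothetical and is not part of lkp-tests.

```python
def change(name: str, base: float, new: float) -> str:
    """Format the change between two runs the way the table appears to."""
    if name.endswith("%"):
        # Percentage-valued metrics: absolute difference in percentage points.
        return f"{new - base:+.1f}"
    # Everything else: relative change against the parent commit.
    return f"{(new - base) / base * 100:+.1f}%"


if __name__ == "__main__":
    print(change("stress-ng.sigurg.ops_per_sec", 5652273, 4963242))
    print(change("mpstat.cpu.all.soft%", 24.14, 15.83))
    print(change("vmstat.system.cs", 36434, 39205))
```

With the values copied from the table, the three calls reproduce the reported -12.2%, -8.3 and +7.6% entries.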

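Several perf-stat rows are ratios of other rows, so they move together. The sketch below recomputes them under the conventional definitions: MPKI as cache misses per thousand instructions, CPI as the inverse of IPC, and cycles between cache misses as cycles divided by cache misses. These definitions are an assumption about how the report derives the rows, and the function names are hypothetical. The inputs are the rounded perf-stat.i values from the table.

```python
def mpki(cache_misses: float, instructions: float) -> float:
    """Cache misses per thousand instructions."""
    return cache_misses / instructions * 1000.0


def cycles_between_misses(cpi: float, instructions: float,
                          cache_misses: float) -> float:
    """Cycles (CPI * instructions) divided by cache misses."""
    return cpi * instructions / cache_misses


# Rounded perf-stat.i values for the parent commit and for 1d2fbaad7c.
RUNS = {
    "75dff0584c": {"misses": 3.325e8, "instr": 4.888e11, "cpi": 1.31},
    "1d2fbaad7c": {"misses": 2.465e8, "instr": 6.174e11, "cpi": 1.03},
}

if __name__ == "__main__":
    for commit, r in RUNS.items():
        print(commit,
              f"MPKI={mpki(r['misses'], r['instr']):.2f}",
              f"IPC={1.0 / r['cpi']:.2f}",
              "cycles/miss="
              f"{cycles_between_misses(r['cpi'], r['instr'], r['misses']):.0f}")
```

Because the inputs are already rounded, the results match the reported MPKI exactly and the reported IPC and cycles-between-cache-misses only to within rounding.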



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


