* [linus:master] [net] 16c610162d: netperf.Throughput_tps 17.2% regression
@ 2025-10-28 6:25 kernel test robot
2025-10-28 6:57 ` Eric Dumazet
From: kernel test robot @ 2025-10-28 6:25 UTC
To: Eric Dumazet
Cc: oe-lkp, lkp, linux-kernel, Jakub Kicinski, Kuniyuki Iwashima,
netdev, oliver.sang
Hello,
kernel test robot noticed a 17.2% regression of netperf.Throughput_tps on:
commit: 16c610162d1f1c332209de1c91ffb09b659bb65d ("net: call cond_resched() less often in __release_sock()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[still regression on linus/master dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa]
[still regression on linux-next/master 8fec172c82c2b5f6f8e47ab837c1dc91ee3d1b87]
testcase: netperf
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:
ip: ipv4
runtime: 300s
nr_threads: 200%
cluster: cs-localhost
test: TCP_CRR
cpufreq_governor: performance
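As a rough illustration of what these parameters imply, the sketch below assumes that "nr_threads: 200%" is relative to the machine's 192 hardware threads (an interpretation, not something the report states) and cross-checks it against the total and per-instance throughput rows reported further down. The function name is hypothetical.

```python
# Figures copied from this report; reading "200%" as a percentage of the
# hardware thread count is an assumption.
HW_THREADS = 192        # "192 threads 2 sockets"
NR_THREADS_PCT = 200    # "nr_threads: 200%"


def instance_count(hw_threads: int, pct: int) -> int:
    """Concurrent test instances if pct is relative to hardware threads."""
    return hw_threads * pct // 100


expected = instance_count(HW_THREADS, NR_THREADS_PCT)

# Cross-check: Throughput_total_tps divided by Throughput_tps should be
# close to the number of instances, for both kernels.
implied_before = round(2874230 / 7484)
implied_after = round(2379822 / 6197)

print(expected, implied_before, implied_after)
```

If the assumption holds, all three printed values agree, which would mean the per-instance figure (Throughput_tps) is the total divided across every instance running on the oversubscribed host.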
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202510281337.398a9aa9-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251028/202510281337.398a9aa9-lkp@intel.com
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
cs-localhost/gcc-14/performance/ipv4/x86_64-rhel-9.4/200%/debian-13-x86_64-20250902.cgz/300s/lkp-srf-2sp3/TCP_CRR/netperf
commit:
abfa70b380 ("Merge branch 'tcp-__tcp_close-changes'")
16c610162d ("net: call cond_resched() less often in __release_sock()")
abfa70b380348cf4 16c610162d1f1c332209de1c91f
---------------- ---------------------------
%stddev %change %stddev
\ | \
2.80 -0.4 2.43 ± 3% mpstat.cpu.all.usr%
199581 ± 96% -75.4% 49072 ± 64% numa-meminfo.node0.Mapped
6583442 ± 6% -30.2% 4594175 ± 5% numa-numastat.node0.local_node
6709344 ± 6% -30.4% 4672973 ± 5% numa-numastat.node0.numa_hit
50277 ± 96% -75.4% 12383 ± 63% numa-vmstat.node0.nr_mapped
6708267 ± 6% -30.3% 4672365 ± 5% numa-vmstat.node0.numa_hit
6582364 ± 6% -30.2% 4593568 ± 5% numa-vmstat.node0.numa_local
224.83 ±100% +224.8% 730.17 ± 36% perf-c2c.DRAM.local
1438 ±100% +132.4% 3343 ± 11% perf-c2c.DRAM.remote
1569 ±100% +115.5% 3383 ± 10% perf-c2c.HITM.local
1089 ±100% +121.1% 2408 ± 10% perf-c2c.HITM.remote
14776381 ± 9% -21.6% 11587148 ± 8% proc-vmstat.numa_hit
14576750 ± 9% -21.9% 11387471 ± 8% proc-vmstat.numa_local
51492399 ± 6% -26.1% 38054262 ± 5% proc-vmstat.pgalloc_normal
48277971 ± 5% -26.9% 35310227 ± 5% proc-vmstat.pgfree
2874230 -17.2% 2379822 netperf.ThroughputBoth_total_tps
7484 -17.2% 6197 netperf.ThroughputBoth_tps
2874230 -17.2% 2379822 netperf.Throughput_total_tps
7484 -17.2% 6197 netperf.Throughput_tps
1.351e+09 -13.7% 1.165e+09 netperf.time.involuntary_context_switches
9145 +7.8% 9855 netperf.time.percent_of_cpu_this_job_got
27055 +8.4% 29322 netperf.time.system_time
927.87 -11.1% 824.49 netperf.time.user_time
1.975e+08 ± 5% -28.2% 1.418e+08 ± 6% netperf.time.voluntary_context_switches
8.623e+08 -17.2% 7.139e+08 netperf.workload
7908218 ± 8% +33.3% 10540980 ± 7% sched_debug.cfs_rq:/.avg_vruntime.stddev
2.27 -10.2% 2.04 sched_debug.cfs_rq:/.h_nr_queued.avg
11.92 ± 7% -18.9% 9.67 ± 8% sched_debug.cfs_rq:/.h_nr_queued.max
2.33 ± 5% -13.6% 2.02 ± 4% sched_debug.cfs_rq:/.h_nr_queued.stddev
5.14 ± 27% -50.8% 2.53 ± 51% sched_debug.cfs_rq:/.load_avg.min
7908224 ± 8% +33.3% 10540996 ± 7% sched_debug.cfs_rq:/.min_vruntime.stddev
245718 ± 4% -10.4% 220184 ± 8% sched_debug.cpu.max_idle_balance_cost.stddev
2.26 -10.2% 2.03 sched_debug.cpu.nr_running.avg
2.33 ± 5% -13.8% 2.01 ± 4% sched_debug.cpu.nr_running.stddev
8021905 -16.0% 6738879 sched_debug.cpu.nr_switches.avg
10163286 -20.5% 8082726 ± 2% sched_debug.cpu.nr_switches.max
1494738 ± 14% -50.1% 745542 ± 9% sched_debug.cpu.nr_switches.stddev
6.417e+10 -16.1% 5.383e+10 perf-stat.i.branch-instructions
0.52 -0.0 0.49 perf-stat.i.branch-miss-rate%
3.329e+08 -21.1% 2.628e+08 perf-stat.i.branch-misses
49601635 ± 8% -15.1% 42090142 ± 6% perf-stat.i.cache-misses
2.238e+08 -11.6% 1.979e+08 ± 2% perf-stat.i.cache-references
10160912 -15.7% 8567209 perf-stat.i.context-switches
1.74 +20.0% 2.09 perf-stat.i.cpi
2679 ± 7% -22.9% 2067 ± 3% perf-stat.i.cpu-migrations
12544 ± 7% +17.2% 14707 ± 5% perf-stat.i.cycles-between-cache-misses
3.464e+11 -16.3% 2.898e+11 perf-stat.i.instructions
0.58 -16.4% 0.49 perf-stat.i.ipc
52.92 -15.7% 44.62 perf-stat.i.metric.K/sec
0.52 -0.0 0.49 perf-stat.overall.branch-miss-rate%
1.74 +19.4% 2.07 perf-stat.overall.cpi
12209 ± 8% +17.3% 14320 ± 6% perf-stat.overall.cycles-between-cache-misses
0.58 -16.3% 0.48 perf-stat.overall.ipc
122980 +1.1% 124361 perf-stat.overall.path-length
6.398e+10 -16.1% 5.367e+10 perf-stat.ps.branch-instructions
3.319e+08 -21.1% 2.62e+08 perf-stat.ps.branch-misses
49465671 ± 8% -15.1% 41971976 ± 6% perf-stat.ps.cache-misses
2.231e+08 -11.6% 1.973e+08 ± 2% perf-stat.ps.cache-references
10129507 -15.7% 8540638 perf-stat.ps.context-switches
2669 ± 7% -22.8% 2061 ± 3% perf-stat.ps.cpu-migrations
3.454e+11 -16.3% 2.89e+11 perf-stat.ps.instructions
1.06e+14 -16.3% 8.879e+13 perf-stat.total.instructions
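The headline figures in the table can be re-derived from the raw columns. The sketch below recomputes the percentage change in throughput, and checks two relations that appear to hold in this data: netperf.workload looks like total transactions per second times the 300s runtime, and perf-stat.overall.path-length looks like total instructions divided by workload. Both relations are inferred from the numbers, not stated in the report.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Throughput rows (base commit vs. 16c610162d).
tps_change = pct_change(7484, 6197)
total_change = pct_change(2874230, 2379822)

# Assumed relation: workload = total tps * runtime in seconds.
workload_before = 2874230 * 300
workload_after = 2379822 * 300

# Assumed relation: path length = instructions per transaction.
path_before = 1.06e14 / 8.623e8
path_after = 8.879e13 / 7.139e8

print(f"tps change:       {tps_change:.1f}%")
print(f"total tps change: {total_change:.1f}%")
print(f"workload:         {workload_before:.3e} -> {workload_after:.3e}")
print(f"path length:      {path_before:.0f} -> {path_after:.0f}")
```

Under these assumptions the instructions per transaction barely move (about +1%), while perf-stat.overall.cpi rises by roughly 19%. That suggests, without proving it, that the throughput drop comes from more cycles per instruction rather than from more instructions executed per transaction.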
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [linus:master] [net] 16c610162d: netperf.Throughput_tps 17.2% regression
2025-10-28 6:25 [linus:master] [net] 16c610162d: netperf.Throughput_tps 17.2% regression kernel test robot
@ 2025-10-28 6:57 ` Eric Dumazet
From: Eric Dumazet @ 2025-10-28 6:57 UTC
To: kernel test robot
Cc: oe-lkp, lkp, linux-kernel, Jakub Kicinski, Kuniyuki Iwashima,
netdev
On Mon, Oct 27, 2025 at 11:26 PM kernel test robot
<oliver.sang@intel.com> wrote:
>
>
>
> Hello,
>
> kernel test robot noticed a 17.2% regression of netperf.Throughput_tps on:
>
>
> commit: 16c610162d1f1c332209de1c91ffb09b659bb65d ("net: call cond_resched() less often in __release_sock()")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> [still regression on linus/master dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa]
> [still regression on linux-next/master 8fec172c82c2b5f6f8e47ab837c1dc91ee3d1b87]
>
> testcase: netperf
> config: x86_64-rhel-9.4
> compiler: gcc-14
> test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
> parameters:
>
> ip: ipv4
> runtime: 300s
> nr_threads: 200%
> cluster: cs-localhost
> test: TCP_CRR
> cpufreq_governor: performance
>
>
>
I will not consider this a regression.
If anyone is interested, they would have to investigate whether TCP_CRR on
localhost is really an interesting metric, and why it would depend on
cond_resched() in __release_sock().
Thank you.
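For readers unfamiliar with the pattern named in the commit subject, here is a minimal userspace sketch of "offer to reschedule less often" while draining a queue. It is not the kernel code and does not reproduce the patch, which this thread does not show; the function name, the callbacks, and the batch size of 16 are all hypothetical.

```python
from typing import Callable, Iterable


def drain_backlog(packets: Iterable[object],
                  process: Callable[[object], None],
                  maybe_yield: Callable[[], None],
                  batch: int = 16) -> int:
    """Process every packet, calling maybe_yield() once per `batch` packets.

    Returns the number of times maybe_yield() was called.
    `batch` is an illustrative value, not taken from the patch.
    """
    yields = 0
    for n, pkt in enumerate(packets, start=1):
        process(pkt)
        if n % batch == 0:
            maybe_yield()   # stands in for a voluntary reschedule point
            yields += 1
    return yields


# Yielding after every packet versus once per 16 packets.
every_packet = drain_backlog(range(40), lambda p: None, lambda: None, batch=1)
batched = drain_backlog(range(40), lambda p: None, lambda: None, batch=16)
print(every_packet, batched)
```

With fewer yield points the drain loop holds the CPU longer between scheduling opportunities, which is one plausible reason a heavily oversubscribed localhost benchmark could react to such a change.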
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add the following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202510281337.398a9aa9-lkp@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20251028/202510281337.398a9aa9-lkp@intel.com
>
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
> cs-localhost/gcc-14/performance/ipv4/x86_64-rhel-9.4/200%/debian-13-x86_64-20250902.cgz/300s/lkp-srf-2sp3/TCP_CRR/netperf
>
> commit:
> abfa70b380 ("Merge branch 'tcp-__tcp_close-changes'")
> 16c610162d ("net: call cond_resched() less often in __release_sock()")
>
> abfa70b380348cf4 16c610162d1f1c332209de1c91f
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 2.80 -0.4 2.43 ± 3% mpstat.cpu.all.usr%
> 199581 ± 96% -75.4% 49072 ± 64% numa-meminfo.node0.Mapped
> 6583442 ± 6% -30.2% 4594175 ± 5% numa-numastat.node0.local_node
> 6709344 ± 6% -30.4% 4672973 ± 5% numa-numastat.node0.numa_hit
> 50277 ± 96% -75.4% 12383 ± 63% numa-vmstat.node0.nr_mapped
> 6708267 ± 6% -30.3% 4672365 ± 5% numa-vmstat.node0.numa_hit
> 6582364 ± 6% -30.2% 4593568 ± 5% numa-vmstat.node0.numa_local
> 224.83 ±100% +224.8% 730.17 ± 36% perf-c2c.DRAM.local
> 1438 ±100% +132.4% 3343 ± 11% perf-c2c.DRAM.remote
> 1569 ±100% +115.5% 3383 ± 10% perf-c2c.HITM.local
> 1089 ±100% +121.1% 2408 ± 10% perf-c2c.HITM.remote
> 14776381 ± 9% -21.6% 11587148 ± 8% proc-vmstat.numa_hit
> 14576750 ± 9% -21.9% 11387471 ± 8% proc-vmstat.numa_local
> 51492399 ± 6% -26.1% 38054262 ± 5% proc-vmstat.pgalloc_normal
> 48277971 ± 5% -26.9% 35310227 ± 5% proc-vmstat.pgfree
> 2874230 -17.2% 2379822 netperf.ThroughputBoth_total_tps
> 7484 -17.2% 6197 netperf.ThroughputBoth_tps
> 2874230 -17.2% 2379822 netperf.Throughput_total_tps
> 7484 -17.2% 6197 netperf.Throughput_tps
> 1.351e+09 -13.7% 1.165e+09 netperf.time.involuntary_context_switches
> 9145 +7.8% 9855 netperf.time.percent_of_cpu_this_job_got
> 27055 +8.4% 29322 netperf.time.system_time
> 927.87 -11.1% 824.49 netperf.time.user_time
> 1.975e+08 ± 5% -28.2% 1.418e+08 ± 6% netperf.time.voluntary_context_switches
> 8.623e+08 -17.2% 7.139e+08 netperf.workload
> 7908218 ± 8% +33.3% 10540980 ± 7% sched_debug.cfs_rq:/.avg_vruntime.stddev
> 2.27 -10.2% 2.04 sched_debug.cfs_rq:/.h_nr_queued.avg
> 11.92 ± 7% -18.9% 9.67 ± 8% sched_debug.cfs_rq:/.h_nr_queued.max
> 2.33 ± 5% -13.6% 2.02 ± 4% sched_debug.cfs_rq:/.h_nr_queued.stddev
> 5.14 ± 27% -50.8% 2.53 ± 51% sched_debug.cfs_rq:/.load_avg.min
> 7908224 ± 8% +33.3% 10540996 ± 7% sched_debug.cfs_rq:/.min_vruntime.stddev
> 245718 ± 4% -10.4% 220184 ± 8% sched_debug.cpu.max_idle_balance_cost.stddev
> 2.26 -10.2% 2.03 sched_debug.cpu.nr_running.avg
> 2.33 ± 5% -13.8% 2.01 ± 4% sched_debug.cpu.nr_running.stddev
> 8021905 -16.0% 6738879 sched_debug.cpu.nr_switches.avg
> 10163286 -20.5% 8082726 ± 2% sched_debug.cpu.nr_switches.max
> 1494738 ± 14% -50.1% 745542 ± 9% sched_debug.cpu.nr_switches.stddev
> 6.417e+10 -16.1% 5.383e+10 perf-stat.i.branch-instructions
> 0.52 -0.0 0.49 perf-stat.i.branch-miss-rate%
> 3.329e+08 -21.1% 2.628e+08 perf-stat.i.branch-misses
> 49601635 ± 8% -15.1% 42090142 ± 6% perf-stat.i.cache-misses
> 2.238e+08 -11.6% 1.979e+08 ± 2% perf-stat.i.cache-references
> 10160912 -15.7% 8567209 perf-stat.i.context-switches
> 1.74 +20.0% 2.09 perf-stat.i.cpi
> 2679 ± 7% -22.9% 2067 ± 3% perf-stat.i.cpu-migrations
> 12544 ± 7% +17.2% 14707 ± 5% perf-stat.i.cycles-between-cache-misses
> 3.464e+11 -16.3% 2.898e+11 perf-stat.i.instructions
> 0.58 -16.4% 0.49 perf-stat.i.ipc
> 52.92 -15.7% 44.62 perf-stat.i.metric.K/sec
> 0.52 -0.0 0.49 perf-stat.overall.branch-miss-rate%
> 1.74 +19.4% 2.07 perf-stat.overall.cpi
> 12209 ± 8% +17.3% 14320 ± 6% perf-stat.overall.cycles-between-cache-misses
> 0.58 -16.3% 0.48 perf-stat.overall.ipc
> 122980 +1.1% 124361 perf-stat.overall.path-length
> 6.398e+10 -16.1% 5.367e+10 perf-stat.ps.branch-instructions
> 3.319e+08 -21.1% 2.62e+08 perf-stat.ps.branch-misses
> 49465671 ± 8% -15.1% 41971976 ± 6% perf-stat.ps.cache-misses
> 2.231e+08 -11.6% 1.973e+08 ± 2% perf-stat.ps.cache-references
> 10129507 -15.7% 8540638 perf-stat.ps.context-switches
> 2669 ± 7% -22.8% 2061 ± 3% perf-stat.ps.cpu-migrations
> 3.454e+11 -16.3% 2.89e+11 perf-stat.ps.instructions
> 1.06e+14 -16.3% 8.879e+13 perf-stat.total.instructions
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>