From: kernel test robot <oliver.sang@intel.com>
To: Fernand Sieber <sieberf@amazon.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-kernel@vger.kernel.org>, <aubrey.li@linux.intel.com>,
<yu.c.chen@intel.com>, <peterz@infradead.org>,
<bsegall@google.com>, <dietmar.eggemann@arm.com>,
<dwmw@amazon.co.uk>, <graf@amazon.com>, <jschoenh@amazon.de>,
<juri.lelli@redhat.com>, <mingo@redhat.com>, <sieberf@amazon.com>,
<tanghui20@huawei.com>, <vincent.guittot@linaro.org>,
<vineethr@linux.ibm.com>, <wangtao554@huawei.com>,
<zhangqiao22@huawei.com>, <oliver.sang@intel.com>
Subject: Re: [PATCH v3] sched/fair: Forfeit vruntime on yield
Date: Fri, 26 Sep 2025 12:56:49 +0800
Message-ID: <202509261113.a87577ce-lkp@intel.com>
In-Reply-To: <20250918150528.292620-1-sieberf@amazon.com>
Hello,
we previously reported "a 55.9% improvement of stress-ng.wait.ops_per_sec"
in https://lore.kernel.org/all/202509241501.f14b210a-lkp@intel.com/
We have now noticed that there is also a regression in our tests, so we are
reporting again FYI.
One thing we want to mention is that "stress-ng.sockpair.MB_written_per_sec"
is listed under "miscellaneous metrics" of this stress-ng test. For the major
metric, "stress-ng.sockpair.ops_per_sec", there is only a small difference.
0d4eaf8caf8cd633 15bf8c7b35e31295b26241425c0
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
    551.38           -90.5%      52.18        stress-ng.sockpair.MB_written_per_sec
    781743            -2.3%     764106        stress-ng.sockpair.ops_per_sec
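For reference, the %change column appears to be the relative difference of the
patched value against the base value. A minimal Python sketch, assuming that
definition (the helper name pct_change is illustrative and not part of
lkp-tests):

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change of `patched` versus `base`, in percent."""
    return (patched - base) / base * 100.0


# Values copied from the comparison table above.
mb_written = pct_change(551.38, 52.18)
ops_per_sec = pct_change(781743, 764106)

print(f"MB_written_per_sec: {mb_written:+.1f}%")
print(f"ops_per_sec:        {ops_per_sec:+.1f}%")
```

Under that assumption the two results round to -90.5% and -2.3%, matching the
table.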
below is a test example for 15bf8c7b35:
2025-09-25 15:48:21 stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --oom-avoid --sockpair 192
stress-ng: info: [8371] setting to a 1 min run per stressor
stress-ng: info: [8371] dispatching hogs: 192 sockpair
stress-ng: info: [8371] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty. Setting this to 0 may improve performance metrics
stress-ng: metrc: [8371] stressor    bogo ops  real time  usr time  sys time   bogo ops/s      bogo ops/s  CPU used per  RSS Max
stress-ng: metrc: [8371]                          (secs)    (secs)    (secs)  (real time)  (usr+sys time)  instance (%)     (KB)
stress-ng: metrc: [8371] sockpair    49874197      65.44     72.08  12219.54    762108.28         4057.58         97.82     3132
stress-ng: metrc: [8371] miscellaneous metrics:
stress-ng: metrc: [8371] sockpair 27717.04 socketpair calls sec (harmonic mean of 192 instances)
stress-ng: metrc: [8371] sockpair 53.01 MB written per sec (harmonic mean of 192 instances)
stress-ng: info: [8371] for a 66.13s run time:
stress-ng: info: [8371] 12696.46s available CPU time
stress-ng: info: [8371] 72.07s user time ( 0.57%)
stress-ng: info: [8371] 12219.63s system time ( 96.24%)
stress-ng: info: [8371] 12291.70s total time ( 96.81%)
stress-ng: info: [8371] load average: 190.99 57.46 19.94
stress-ng: info: [8371] skipped: 0
stress-ng: info: [8371] passed: 192: sockpair (192)
stress-ng: info: [8371] failed: 0
stress-ng: info: [8371] metrics untrustworthy: 0
stress-ng: info: [8371] successful run completed in 1 min, 6.13 secs
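The summary lines of this run can be cross-checked. The percentages appear to
be each time divided by the available CPU time, which is roughly the run time
multiplied by the 192 CPUs. A small sketch under that assumption (variable
names are illustrative):

```python
# Values from the 15bf8c7b35 run above, in seconds.
available = 12696.46   # available CPU time
user = 72.07
system = 12219.63

total = user + system
user_pct = user / available * 100
system_pct = system / available * 100
total_pct = total / available * 100

print(f"user   {user:.2f}s ({user_pct:.2f}%)")
print(f"system {system:.2f}s ({system_pct:.2f}%)")
print(f"total  {total:.2f}s ({total_pct:.2f}%)")
```

The computed shares round to 0.57%, 96.24% and 96.81%, in line with the log.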
below is an example from 0d4eaf8caf:
2025-09-25 18:04:37 stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --oom-avoid --sockpair 192
stress-ng: info: [8360] setting to a 1 min run per stressor
stress-ng: info: [8360] dispatching hogs: 192 sockpair
stress-ng: info: [8360] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty. Setting this to 0 may improve performance metrics
stress-ng: metrc: [8360] stressor    bogo ops  real time  usr time  sys time   bogo ops/s      bogo ops/s  CPU used per  RSS Max
stress-ng: metrc: [8360]                          (secs)    (secs)    (secs)  (real time)  (usr+sys time)  instance (%)     (KB)
stress-ng: metrc: [8360] sockpair    51705787      65.08     56.75  12254.39    794448.25         4199.92         98.52     5160
stress-ng: metrc: [8360] miscellaneous metrics:
stress-ng: metrc: [8360] sockpair 28156.62 socketpair calls sec (harmonic mean of 192 instances)
stress-ng: metrc: [8360] sockpair 562.18 MB written per sec (harmonic mean of 192 instances)
stress-ng: info: [8360] for a 65.40s run time:
stress-ng: info: [8360] 12556.08s available CPU time
stress-ng: info: [8360] 56.75s user time ( 0.45%)
stress-ng: info: [8360] 12254.48s system time ( 97.60%)
stress-ng: info: [8360] 12311.23s total time ( 98.05%)
stress-ng: info: [8360] load average: 239.81 72.31 25.10
stress-ng: info: [8360] skipped: 0
stress-ng: info: [8360] passed: 192: sockpair (192)
stress-ng: info: [8360] failed: 0
stress-ng: info: [8360] metrics untrustworthy: 0
stress-ng: info: [8360] successful run completed in 1 min, 5.40 secs
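Note that stress-ng reports "MB written per sec" as a harmonic mean over the
192 instances. A harmonic mean is pulled toward the slowest instances, so a
large drop could in principle come from a subset of starved writers rather
than a uniform slowdown. The per-instance values are not part of this report,
so the numbers in the sketch below are made up purely for illustration:

```python
from statistics import harmonic_mean, mean

# Hypothetical per-instance throughputs in MB/s (NOT taken from the report).
balanced = [560.0] * 8
skewed = [560.0] * 4 + [30.0] * 4   # half of the instances starved

for name, rates in (("balanced", balanced), ("skewed", skewed)):
    print(f"{name}: arithmetic={mean(rates):.1f} "
          f"harmonic={harmonic_mean(rates):.1f}")
```

With the skewed list the harmonic mean falls far below the arithmetic mean,
which is why the per-instance distribution would be worth checking before
drawing conclusions from the aggregate alone.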
below is the full report.
kernel test robot noticed a 90.5% regression of stress-ng.sockpair.MB_written_per_sec on:
commit: 15bf8c7b35e31295b26241425c0a61102e92109f ("[PATCH v3] sched/fair: Forfeit vruntime on yield")
url: https://github.com/intel-lab-lkp/linux/commits/Fernand-Sieber/sched-fair-Forfeit-vruntime-on-yield/20250918-231320
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 0d4eaf8caf8cd633b23e949e2996b420052c2d45
patch link: https://lore.kernel.org/all/20250918150528.292620-1-sieberf@amazon.com/
patch subject: [PATCH v3] sched/fair: Forfeit vruntime on yield
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: sockpair
cpufreq_governor: performance
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202509261113.a87577ce-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250926/202509261113.a87577ce-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp3/sockpair/stress-ng/60s
commit:
0d4eaf8caf ("sched/fair: Do not balance task to a throttled cfs_rq")
15bf8c7b35 ("sched/fair: Forfeit vruntime on yield")
0d4eaf8caf8cd633 15bf8c7b35e31295b26241425c0
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.78 ± 2% +0.2 1.02 mpstat.cpu.all.usr%
19.57 -36.8% 12.36 ± 70% turbostat.RAMWatt
4.073e+08 ± 6% +23.1% 5.013e+08 ± 5% cpuidle..time
266261 ± 9% +46.4% 389733 ± 9% cpuidle..usage
451887 ± 77% +160.9% 1178929 ± 33% numa-vmstat.node0.nr_file_pages
192819 ± 30% +101.3% 388191 ± 43% numa-vmstat.node1.nr_shmem
1807416 ± 77% +161.0% 4716665 ± 33% numa-meminfo.node0.FilePages
8980121 -9.0% 8174177 numa-meminfo.node0.SUnreclaim
25356157 ± 8% -22.0% 19772595 ± 9% numa-meminfo.node1.MemUsed
771480 ± 30% +101.4% 1553932 ± 43% numa-meminfo.node1.Shmem
551.38 -90.5% 52.18 stress-ng.sockpair.MB_written_per_sec
51092272 -2.2% 49968621 stress-ng.sockpair.ops
781743 -2.3% 764106 stress-ng.sockpair.ops_per_sec
21418332 ± 4% +69.2% 36232510 stress-ng.time.involuntary_context_switches
56.36 +27.4% 71.81 stress-ng.time.user_time
150809 ± 21% +17217.1% 26115838 ± 3% stress-ng.time.voluntary_context_switches
2165914 ± 7% +92.3% 4165197 ± 4% meminfo.Active
2165898 ± 7% +92.3% 4165181 ± 4% meminfo.Active(anon)
4926568 +39.6% 6875228 meminfo.Cached
6826363 +28.1% 8744371 meminfo.Committed_AS
513281 ± 8% +98.7% 1019681 ± 6% meminfo.Mapped
48472806 ± 2% -14.8% 41314088 meminfo.Memused
1276164 +152.7% 3224818 ± 3% meminfo.Shmem
53022761 ± 2% -15.7% 44672632 meminfo.max_used_kB
0.53 -81.0% 0.10 ± 4% perf-sched.sch_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
0.53 -81.0% 0.10 ± 4% perf-sched.total_sch_delay.average.ms
2.03 -68.4% 0.64 ± 4% perf-sched.total_wait_and_delay.average.ms
1811449 +200.9% 5449776 ± 4% perf-sched.total_wait_and_delay.count.ms
1.50 -64.0% 0.54 ± 4% perf-sched.total_wait_time.average.ms
2.03 -68.4% 0.64 ± 4% perf-sched.wait_and_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
1811449 +200.9% 5449776 ± 4% perf-sched.wait_and_delay.count.[unknown].[unknown].[unknown].[unknown].[unknown]
1.50 -64.0% 0.54 ± 4% perf-sched.wait_time.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
541937 ± 7% +92.5% 1043389 ± 4% proc-vmstat.nr_active_anon
5242293 +3.5% 5423918 proc-vmstat.nr_dirty_background_threshold
10497404 +3.5% 10861099 proc-vmstat.nr_dirty_threshold
1232280 +39.7% 1721251 proc-vmstat.nr_file_pages
52782357 +3.4% 54601330 proc-vmstat.nr_free_pages
52117733 +3.8% 54073313 proc-vmstat.nr_free_pages_blocks
128259 ± 8% +100.8% 257594 ± 6% proc-vmstat.nr_mapped
319681 +153.0% 808650 ± 3% proc-vmstat.nr_shmem
4489133 -8.9% 4089704 proc-vmstat.nr_slab_unreclaimable
541937 ± 7% +92.5% 1043389 ± 4% proc-vmstat.nr_zone_active_anon
77303955 +2.5% 79201972 proc-vmstat.pgalloc_normal
519724 +5.2% 546556 proc-vmstat.pgfault
76456707 +1.7% 77739095 proc-vmstat.pgfree
12794131 ± 6% -27.4% 9288185 sched_debug.cfs_rq:/.avg_vruntime.max
4610143 ± 8% -14.9% 3923890 ± 5% sched_debug.cfs_rq:/.avg_vruntime.min
1.03 -20.1% 0.83 ± 2% sched_debug.cfs_rq:/.h_nr_queued.avg
1.03 -20.8% 0.82 ± 2% sched_debug.cfs_rq:/.h_nr_runnable.avg
895.00 ± 70% +89.0% 1691 ± 2% sched_debug.cfs_rq:/.load.min
0.67 ± 55% +125.0% 1.50 sched_debug.cfs_rq:/.load_avg.min
12794131 ± 6% -27.4% 9288185 sched_debug.cfs_rq:/.min_vruntime.max
4610143 ± 8% -14.9% 3923896 ± 5% sched_debug.cfs_rq:/.min_vruntime.min
1103 -20.2% 880.86 sched_debug.cfs_rq:/.runnable_avg.avg
428.26 ± 6% -63.4% 156.94 ± 22% sched_debug.cfs_rq:/.util_est.avg
1775 ± 6% -39.3% 1077 ± 15% sched_debug.cfs_rq:/.util_est.max
396.33 ± 6% -50.0% 198.03 ± 17% sched_debug.cfs_rq:/.util_est.stddev
50422 ± 6% -34.7% 32915 ± 18% sched_debug.cpu.avg_idle.min
456725 ± 10% +39.4% 636811 ± 4% sched_debug.cpu.avg_idle.stddev
611566 ± 5% +25.0% 764424 ± 2% sched_debug.cpu.max_idle_balance_cost.avg
190657 ± 12% +36.1% 259410 ± 5% sched_debug.cpu.max_idle_balance_cost.stddev
1.04 -20.4% 0.82 ± 2% sched_debug.cpu.nr_running.avg
57214 ± 4% +183.5% 162228 ± 2% sched_debug.cpu.nr_switches.avg
253314 ± 4% +39.3% 352777 ± 4% sched_debug.cpu.nr_switches.max
59410 ± 6% +31.6% 78186 ± 10% sched_debug.cpu.nr_switches.stddev
3.33 -27.9% 2.40 perf-stat.i.MPKI
1.207e+10 +11.3% 1.344e+10 perf-stat.i.branch-instructions
0.21 ± 7% +0.0 0.24 ± 5% perf-stat.i.branch-miss-rate%
23462655 ± 6% +27.4% 29896517 ± 3% perf-stat.i.branch-misses
75.74 -4.4 71.33 perf-stat.i.cache-miss-rate%
1.861e+08 -21.5% 1.462e+08 perf-stat.i.cache-misses
2.435e+08 -17.1% 2.017e+08 perf-stat.i.cache-references
323065 ± 5% +191.4% 941425 ± 2% perf-stat.i.context-switches
10.73 -9.7% 9.69 perf-stat.i.cpi
353.45 +39.0% 491.13 ± 4% perf-stat.i.cpu-migrations
3589 +30.5% 4685 perf-stat.i.cycles-between-cache-misses
5.645e+10 +12.0% 6.323e+10 perf-stat.i.instructions
0.09 +12.1% 0.11 perf-stat.i.ipc
1.66 ± 5% +193.9% 4.89 ± 2% perf-stat.i.metric.K/sec
6247 +5.7% 6603 ± 2% perf-stat.i.minor-faults
6248 +5.7% 6604 ± 2% perf-stat.i.page-faults
3.33 -29.7% 2.34 perf-stat.overall.MPKI
0.20 ± 7% +0.0 0.23 ± 4% perf-stat.overall.branch-miss-rate%
76.67 -3.9 72.79 perf-stat.overall.cache-miss-rate%
10.54 -11.1% 9.37 perf-stat.overall.cpi
3168 +26.5% 4007 perf-stat.overall.cycles-between-cache-misses
0.09 +12.5% 0.11 perf-stat.overall.ipc
1.204e+10 +11.1% 1.337e+10 perf-stat.ps.branch-instructions
23586580 ± 7% +29.7% 30600100 ± 4% perf-stat.ps.branch-misses
1.873e+08 -21.4% 1.471e+08 perf-stat.ps.cache-misses
2.443e+08 -17.3% 2.021e+08 perf-stat.ps.cache-references
324828 ± 5% +187.0% 932274 ± 2% perf-stat.ps.context-switches
335.13 ± 2% +41.7% 474.95 ± 5% perf-stat.ps.cpu-migrations
5.632e+10 +11.7% 6.293e+10 perf-stat.ps.instructions
6282 +6.5% 6690 ± 2% perf-stat.ps.minor-faults
6284 +6.5% 6692 ± 2% perf-stat.ps.page-faults
3.764e+12 +12.2% 4.224e+12 perf-stat.total.instructions
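The perf-stat.overall.* ratios above appear to follow from the perf-stat.ps.*
counters. A minimal sketch, assuming MPKI means cache misses per thousand
instructions and cache-miss-rate% means misses divided by references (the
function names are illustrative):

```python
def mpki(cache_misses: float, instructions: float) -> float:
    """Cache misses per 1000 instructions."""
    return cache_misses / instructions * 1000


def miss_rate(cache_misses: float, cache_refs: float) -> float:
    """Share of cache references that miss, in percent."""
    return cache_misses / cache_refs * 100


# perf-stat.ps.* values from the table above:
# (instructions, cache-misses, cache-references) per commit.
runs = {
    "0d4eaf8caf": (5.632e10, 1.873e8, 2.443e8),
    "15bf8c7b35": (6.293e10, 1.471e8, 2.021e8),
}

for commit, (ins, miss, refs) in runs.items():
    print(f"{commit}: MPKI={mpki(miss, ins):.2f} "
          f"cache-miss-rate={miss_rate(miss, refs):.2f}%")
```

Under these assumptions the results round to 3.33 / 76.67% for the base commit
and 2.34 / 72.79% for the patched commit, consistent with the
perf-stat.overall rows.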
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki