From: kernel test robot <oliver.sang@intel.com>
To: NeilBrown <neilb@suse.de>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-nfs@vger.kernel.org>, <ying.huang@intel.com>,
<feng.tang@intel.com>, <fengwei.yin@intel.com>,
"Chuck Lever" <chuck.lever@oracle.com>,
Jeff Layton <jlayton@kernel.org>,
"Olga Kornievskaia" <kolga@netapp.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
<oliver.sang@intel.com>
Subject: Re: [PATCH 6/6] sunrpc: introduce possibility that requested number of threads is different from actual
Date: Wed, 30 Oct 2024 14:35:19 +0800 [thread overview]
Message-ID: <202410301321.d8aebe67-oliver.sang@intel.com> (raw)
In-Reply-To: <20241023024222.691745-7-neilb@suse.de>
Hello,
kernel test robot noticed a 10.2% regression of fsmark.files_per_sec on:
commit: d7f6562adeebe62458eb11437b260d3f470849cd ("[PATCH 6/6] sunrpc: introduce possibility that requested number of threads is different from actual")
url: https://github.com/intel-lab-lkp/linux/commits/NeilBrown/SUNRPC-move-nrthreads-counting-to-start-stop-threads/20241023-104539
base: git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
patch link: https://lore.kernel.org/all/20241023024222.691745-7-neilb@suse.de/
patch subject: [PATCH 6/6] sunrpc: introduce possibility that requested number of threads is different from actual
testcase: fsmark
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 64G memory
parameters:
iterations: 1x
nr_threads: 32t
disk: 1SSD
fs: btrfs
fs2: nfsv4
filesize: 8K
test_size: 400M
sync_method: fsyncBeforeClose
nr_directories: 16d
nr_files_per_directory: 256fpd
cpufreq_governor: performance
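
The parameter list above can be sanity-checked with a little arithmetic, and maps onto an fs_mark command line. The flag mapping below (in particular `-S 1` for fsyncBeforeClose) is an assumption, not taken from this report; consult `fs_mark -h` and the lkp-tests fsmark job file for the exact invocation the robot used.

```shell
# File-count arithmetic implied by the parameters above
# (test_size=400M, filesize=8K, nr_threads=32t):
total_files=$(( 400 * 1024 / 8 ))   # 400M / 8K = 51200 files in total
per_thread=$(( total_files / 32 ))  # 1600 files per thread
echo "total=${total_files} per_thread=${per_thread}"

# A hypothetical fs_mark invocation matching those parameters
# (mount point and -S mapping are assumptions):
#   fs_mark -d /mnt/nfs/fsmark -s 8192 -n 1600 -t 32 -D 16 -N 256 -S 1
```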
In addition, the commit also has a significant impact on the following test:
+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | fsmark: fsmark.files_per_sec 10.2% regression |
| test machine | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 64G memory |
| test parameters | cpufreq_governor=performance |
| | disk=1SSD |
| | filesize=9B |
| | fs2=nfsv4 |
| | fs=btrfs |
| | iterations=1x |
| | nr_directories=16d |
| | nr_files_per_directory=256fpd |
| | nr_threads=32t |
| | sync_method=fsyncBeforeClose |
| | test_size=400M |
+------------------+------------------------------------------------------------------------------------------------+
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202410301321.d8aebe67-oliver.sang@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241030/202410301321.d8aebe67-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
gcc-12/performance/1SSD/8K/nfsv4/btrfs/1x/x86_64-rhel-8.3/16d/256fpd/32t/debian-12-x86_64-20240206.cgz/fsyncBeforeClose/lkp-ivb-2ep2/400M/fsmark
commit:
4e9c43765c ("sunrpc: remove all connection limit configuration")
d7f6562ade ("sunrpc: introduce possibility that requested number of threads is different from actual")
4e9c43765c3fd361 d7f6562adeebe62458eb11437b2
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.09 ± 78% +173.9% 0.24 ± 29% perf-stat.i.major-faults
0.07 ± 66% -61.1% 0.03 ± 23% perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
0.02 ± 88% +1806.2% 0.46 ±196% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
-17.50 -31.4% -12.00 sched_debug.cpu.nr_uninterruptible.min
5.32 ± 10% -11.9% 4.69 ± 8% sched_debug.cpu.nr_uninterruptible.stddev
76584407 ± 5% +40.6% 1.077e+08 ± 4% fsmark.app_overhead
3016 ± 2% -10.2% 2707 fsmark.files_per_sec
36.83 -7.7% 34.00 fsmark.time.percent_of_cpu_this_job_got
285398 ± 4% +10.5% 315320 meminfo.Active
282671 ± 4% +10.6% 312596 meminfo.Active(file)
15709 ± 11% -28.8% 11189 ± 2% meminfo.Dirty
70764 ± 4% +10.6% 78241 proc-vmstat.nr_active_file
468998 +5.4% 494463 proc-vmstat.nr_dirtied
3931 ± 11% -28.8% 2797 ± 2% proc-vmstat.nr_dirty
462247 +6.7% 493402 proc-vmstat.nr_written
70764 ± 4% +10.6% 78241 proc-vmstat.nr_zone_active_file
3620 ± 10% -30.7% 2507 ± 5% proc-vmstat.nr_zone_write_pending
1135721 +2.6% 1165013 proc-vmstat.numa_hit
1086049 +2.7% 1115353 proc-vmstat.numa_local
56063 ± 2% +18.0% 66177 proc-vmstat.pgactivate
1521044 +2.3% 1555275 proc-vmstat.pgalloc_normal
2522626 +10.2% 2778923 proc-vmstat.pgpgout
3.75 ± 26% -1.9 1.80 ± 7% perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.75 ± 26% -1.9 1.80 ± 7% perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.54 ± 39% -1.7 0.89 ±100% perf-profile.calltrace.cycles-pp.event_function_call.perf_event_release_kernel.perf_release.__fput.task_work_run
2.54 ± 39% -1.7 0.89 ±100% perf-profile.calltrace.cycles-pp.smp_call_function_single.event_function_call.perf_event_release_kernel.perf_release.__fput
0.61 ±141% +1.5 2.08 ± 25% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.writen.record__pushfn
0.61 ±141% +1.5 2.08 ± 25% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write.writen.record__pushfn.perf_mmap__push
0.61 ±141% +1.5 2.08 ± 25% perf-profile.calltrace.cycles-pp.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64
0.61 ±141% +1.5 2.08 ± 25% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.writen
0.61 ±141% +1.5 2.08 ± 25% perf-profile.calltrace.cycles-pp.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.61 ±141% +1.5 2.08 ± 25% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
0.51 ±141% +1.6 2.09 ± 29% perf-profile.calltrace.cycles-pp.evlist_cpu_iterator__next.__evlist__disable.__cmd_record.cmd_record.run_builtin
0.81 ±100% +2.2 3.01 ± 44% perf-profile.calltrace.cycles-pp.__evlist__disable.__cmd_record.cmd_record.run_builtin.handle_internal_command
3.11 ± 27% +2.6 5.74 ± 24% perf-profile.calltrace.cycles-pp.__cmd_record.cmd_record.run_builtin.handle_internal_command.main
3.11 ± 27% +2.6 5.74 ± 24% perf-profile.calltrace.cycles-pp.cmd_record.run_builtin.handle_internal_command.main
0.61 ±141% +1.5 2.08 ± 25% perf-profile.children.cycles-pp.generic_perform_write
0.61 ±141% +1.5 2.08 ± 25% perf-profile.children.cycles-pp.ksys_write
0.61 ±141% +1.5 2.08 ± 25% perf-profile.children.cycles-pp.shmem_file_write_iter
0.61 ±141% +1.5 2.08 ± 25% perf-profile.children.cycles-pp.vfs_write
0.81 ±100% +2.2 3.01 ± 44% perf-profile.children.cycles-pp.__evlist__disable
***************************************************************************************************
lkp-ivb-2ep2: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 64G memory
=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
gcc-12/performance/1SSD/9B/nfsv4/btrfs/1x/x86_64-rhel-8.3/16d/256fpd/32t/debian-12-x86_64-20240206.cgz/fsyncBeforeClose/lkp-ivb-2ep2/400M/fsmark
commit:
4e9c43765c ("sunrpc: remove all connection limit configuration")
d7f6562ade ("sunrpc: introduce possibility that requested number of threads is different from actual")
4e9c43765c3fd361 d7f6562adeebe62458eb11437b2
---------------- ---------------------------
%stddev %change %stddev
\ | \
2.35 -2.0% 2.30 iostat.cpu.user
137839 ± 27% +41.1% 194529 ± 17% numa-meminfo.node0.Active
136284 ± 27% +41.7% 193101 ± 17% numa-meminfo.node0.Active(file)
2.80 ± 32% -1.6 1.16 ± 46% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
2.80 ± 32% -1.6 1.16 ± 46% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
1.397e+08 ± 2% +14.5% 1.599e+08 fsmark.app_overhead
4135 -10.2% 3712 fsmark.files_per_sec
48.67 -8.2% 44.67 fsmark.time.percent_of_cpu_this_job_got
34164 ± 27% +41.6% 48381 ± 17% numa-vmstat.node0.nr_active_file
276516 ± 17% +26.7% 350296 ± 10% numa-vmstat.node0.nr_dirtied
276215 ± 17% +26.5% 349400 ± 10% numa-vmstat.node0.nr_written
34164 ± 27% +41.6% 48381 ± 17% numa-vmstat.node0.nr_zone_active_file
80.71 ± 30% +38.5% 111.75 ± 20% sched_debug.cfs_rq:/.removed.load_avg.avg
270.60 ± 12% +16.0% 313.94 ± 9% sched_debug.cfs_rq:/.removed.load_avg.stddev
23.59 ± 44% +49.9% 35.35 ± 24% sched_debug.cfs_rq:/.removed.runnable_avg.avg
23.58 ± 44% +49.8% 35.33 ± 24% sched_debug.cfs_rq:/.removed.util_avg.avg
3281 ± 11% +1047.0% 37635 ±193% sched_debug.cpu.avg_idle.min
598666 +7.0% 640842 proc-vmstat.nr_dirtied
1074454 +1.1% 1086656 proc-vmstat.nr_file_pages
197321 +3.7% 204605 proc-vmstat.nr_inactive_file
598082 +6.8% 638786 proc-vmstat.nr_written
197321 +3.7% 204605 proc-vmstat.nr_zone_inactive_file
1716397 +2.3% 1755672 proc-vmstat.numa_hit
1666723 +2.4% 1705951 proc-vmstat.numa_local
145474 +7.4% 156311 proc-vmstat.pgactivate
2183965 +2.0% 2226921 proc-vmstat.pgalloc_normal
3250818 +10.5% 3591588 proc-vmstat.pgpgout
0.00 ±223% +633.3% 0.02 ± 53% perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
0.01 ±223% +729.3% 0.06 ± 29% perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
0.01 ±223% +466.7% 0.03 ± 22% perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.__x64_sys_nanosleep.do_syscall_64
0.00 ±223% +973.3% 0.03 ± 61% perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
0.02 ±223% +480.6% 0.09 ± 25% perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
0.67 ±223% +506.5% 4.04 perf-sched.wait_and_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
3.74 ± 52% +190.1% 10.85 ± 36% perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
0.68 ±223% +526.0% 4.25 ± 8% perf-sched.wait_and_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
18.67 ±173% +375.2% 88.72 ± 23% perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
0.01 ±143% +212.5% 0.03 ± 47% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
0.66 ±223% +504.3% 3.99 perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
3.69 ± 53% +192.6% 10.79 ± 36% perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
0.67 ±223% +525.6% 4.17 ± 8% perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
18.65 ±173% +375.4% 88.66 ± 23% perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki