From: kernel test robot <oliver.sang@intel.com>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
<linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Julia Lawall <julia.lawall@inria.fr>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
<aubrey.li@linux.intel.com>, <yu.c.chen@intel.com>,
<oliver.sang@intel.com>
Subject: [linus:master] [sched/core] e932c4ab38: aim9.sync_disk_cp.ops_per_sec 2.3% improvement
Date: Tue, 24 Dec 2024 16:34:05 +0800
Message-ID: <202412241607.dc13db91-lkp@intel.com>
Hello,
kernel test robot noticed a 2.3% improvement of aim9.sync_disk_cp.ops_per_sec on:
commit: e932c4ab38f072ce5894b2851fea8bc5754bb8e5 ("sched/core: Prevent wakeup of ksoftirqd during idle load balance")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: aim9
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 4 threads Intel(R) Xeon(R) CPU E3-1225 v5 @ 3.30GHz (Skylake) with 16G memory
parameters:

	testtime: 300s
	test: sync_disk_cp
	cpufreq_governor: performance
In addition, the commit has a significant impact on the following test:
+------------------+-----------------------------------------------------------------------------+
| testcase: change | vm-scalability: vm-scalability.throughput 2.4% improvement                  |
| test machine     | 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Skylake) with 32G memory |
| test parameters  | cpufreq_governor=performance                                                |
|                  | runtime=300s                                                                |
|                  | test=migrate                                                                |
+------------------+-----------------------------------------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241224/202412241607.dc13db91-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-20240206.cgz/lkp-skl-d06/sync_disk_cp/aim9/300s
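
The two lines above pair up position by position: the first names the job fields, the second gives their values. A minimal sketch of reading them as a mapping, assuming both lines always carry the same number of slash-separated fields (the variable names here are illustrative, not part of lkp-tests):

```python
# Field names and values copied from the two header lines of this report.
keys_line = "compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:"
values_line = ("gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-20240206.cgz/"
               "lkp-skl-d06/sync_disk_cp/aim9/300s")

# Drop the trailing colon, then pair names with values by position.
keys = keys_line.rstrip(":").split("/")
values = values_line.split("/")
assert len(keys) == len(values), "header and value lines should align"

params = dict(zip(keys, values))
for name, value in params.items():
    print(f"{name:18} {value}")
```
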
commit:
ff47a0acfc ("sched/fair: Check idle_cpu() before need_resched() to detect ilb CPU turning busy")
e932c4ab38 ("sched/core: Prevent wakeup of ksoftirqd during idle load balance")
ff47a0acfcce309c e932c4ab38f072ce5894b2851fe
---------------- ---------------------------
%stddev %change %stddev
\ | \
779244 +2.3% 797195 aim9.sync_disk_cp.ops_per_sec
444185 ± 2% -51.7% 214738 ± 3% cpuidle..usage
40.83 ± 15% -84.5% 6.33 ± 23% perf-c2c.HITM.local
6505472 ± 12% +21.6% 7908010 ± 4% meminfo.DirectMap2M
29200 -10.3% 26194 meminfo.Shmem
0.08 ± 2% -0.0 0.06 ± 2% mpstat.cpu.all.irq%
0.04 ± 3% -0.0 0.03 ± 4% mpstat.cpu.all.soft%
2562 ± 2% -60.3% 1018 vmstat.system.cs
2343 -23.3% 1798 vmstat.system.in
117335 -53.2% 54952 sched_debug.cpu.nr_switches.avg
285639 ± 5% -71.9% 80403 ± 5% sched_debug.cpu.nr_switches.max
100396 ± 9% -77.1% 22968 ± 14% sched_debug.cpu.nr_switches.stddev
7316 -10.5% 6550 proc-vmstat.nr_shmem
58767234 +2.4% 60172860 proc-vmstat.numa_hit
58984855 +2.0% 60176451 proc-vmstat.numa_local
58862408 +2.3% 60212415 proc-vmstat.pgalloc_normal
58848231 +2.3% 60198260 proc-vmstat.pgfree
7.448e+08 +1.7% 7.574e+08 perf-stat.i.branch-instructions
1.35 -0.1 1.29 perf-stat.i.branch-miss-rate%
65562189 ± 2% -4.9% 62378502 perf-stat.i.cache-references
2571 ± 2% -60.5% 1016 perf-stat.i.context-switches
3.732e+09 +1.8% 3.797e+09 perf-stat.i.instructions
0.14 ± 3% -87.0% 0.02 perf-stat.i.metric.K/sec
7.426e+08 +1.7% 7.55e+08 perf-stat.ps.branch-instructions
65356430 ± 2% -4.9% 62171508 perf-stat.ps.cache-references
2563 ± 2% -60.5% 1012 perf-stat.ps.context-switches
3.72e+09 +1.7% 3.785e+09 perf-stat.ps.instructions
1.12e+12 +1.8% 1.14e+12 perf-stat.total.instructions
0.02 ± 25% +78.4% 0.03 ± 18% perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
0.02 ± 55% +82.3% 0.04 ± 16% perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.__flush_work.__lru_add_drain_all
0.04 ± 21% +87.3% 0.07 ± 21% perf-sched.sch_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.01 ± 9% +35.2% 0.02 ± 6% perf-sched.total_sch_delay.average.ms
20.34 ± 5% +111.1% 42.94 perf-sched.total_wait_and_delay.average.ms
7025 ± 6% -54.0% 3228 perf-sched.total_wait_and_delay.count.ms
3058 ± 20% +63.5% 4998 perf-sched.total_wait_and_delay.max.ms
20.33 ± 5% +111.1% 42.92 perf-sched.total_wait_time.average.ms
3058 ± 20% +63.5% 4998 perf-sched.total_wait_time.max.ms
202.58 ± 18% +94.7% 394.49 ± 9% perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
609.98 ± 5% -17.9% 500.63 perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
9.01 ± 12% +6133.8% 561.38 ± 15% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
3837 ± 12% -98.6% 52.17 ± 12% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1349 ± 39% +270.4% 4998 perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
2785 ± 16% -64.1% 1001 perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
202.50 ± 18% +94.8% 394.38 ± 9% perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
609.95 ± 5% -17.9% 500.56 perf-sched.wait_time.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
9.00 ± 12% +6140.7% 561.36 ± 15% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1349 ± 39% +270.4% 4998 perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
2785 ± 16% -64.1% 1001 perf-sched.wait_time.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
1.51 ± 6% -0.9 0.64 ± 11% perf-profile.calltrace.cycles-pp.common_startup_64
1.51 ± 6% -0.9 0.64 ± 11% perf-profile.children.cycles-pp.common_startup_64
1.51 ± 6% -0.9 0.64 ± 11% perf-profile.children.cycles-pp.cpu_startup_entry
1.51 ± 6% -0.9 0.64 ± 11% perf-profile.children.cycles-pp.do_idle
1.12 ± 6% -0.6 0.49 ± 13% perf-profile.children.cycles-pp.cpuidle_idle_call
0.92 ± 5% -0.5 0.42 ± 17% perf-profile.children.cycles-pp.cpuidle_enter
0.92 ± 5% -0.5 0.42 ± 17% perf-profile.children.cycles-pp.cpuidle_enter_state
0.50 ± 6% -0.3 0.21 ± 12% perf-profile.children.cycles-pp.intel_idle
0.52 ± 8% -0.2 0.33 ± 7% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.48 ± 6% -0.2 0.31 ± 7% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.27 ± 18% -0.2 0.10 ± 36% perf-profile.children.cycles-pp.__schedule
0.20 ± 12% -0.2 0.04 ± 73% perf-profile.children.cycles-pp.__flush_smp_call_function_queue
0.21 ± 13% -0.2 0.06 ± 20% perf-profile.children.cycles-pp.flush_smp_call_function_queue
0.24 ± 9% -0.1 0.11 ± 12% perf-profile.children.cycles-pp.ret_from_fork
0.24 ± 9% -0.1 0.11 ± 12% perf-profile.children.cycles-pp.ret_from_fork_asm
0.24 ± 9% -0.1 0.11 ± 10% perf-profile.children.cycles-pp.kthread
0.18 ± 8% -0.1 0.05 ± 49% perf-profile.children.cycles-pp.schedule
0.31 ± 9% -0.1 0.19 ± 8% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.30 ± 9% -0.1 0.19 ± 8% perf-profile.children.cycles-pp.hrtimer_interrupt
0.25 ± 8% -0.1 0.16 ± 7% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.11 ± 11% -0.1 0.02 ± 99% perf-profile.children.cycles-pp.try_to_block_task
0.10 ± 13% -0.1 0.02 ± 99% perf-profile.children.cycles-pp.dequeue_task_fair
0.21 ± 12% -0.1 0.14 ± 5% perf-profile.children.cycles-pp.tick_nohz_handler
0.10 ± 14% -0.1 0.02 ± 99% perf-profile.children.cycles-pp.dequeue_entities
0.17 ± 13% -0.1 0.10 ± 4% perf-profile.children.cycles-pp.update_process_times
0.11 ± 12% -0.0 0.07 ± 6% perf-profile.children.cycles-pp.sched_tick
40.09 +0.6 40.66 perf-profile.children.cycles-pp.read
0.50 ± 6% -0.3 0.21 ± 12% perf-profile.self.cycles-pp.intel_idle
0.97 ± 4% +0.1 1.05 ± 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
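
For readers checking the table above: the %change column appears to be the relative difference between the two per-commit means. A small sketch that recomputes it for three rows, using the rounded means printed in this report (the helper name pct_change is illustrative). Rows whose metric is itself a percentage, such as mpstat.cpu.all.irq% and the perf-profile cycles-pp entries, seem to show an absolute difference in percentage points instead (for example 1.51 to 0.64 is listed as -0.9), so the helper does not apply to them.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0

# (parent commit mean, tested commit mean), taken from the table above.
rows = {
    "aim9.sync_disk_cp.ops_per_sec": (779244, 797195),
    "cpuidle..usage": (444185, 214738),
    "vmstat.system.cs": (2562, 1018),
}

for name, (base, new) in rows.items():
    print(f"{pct_change(base, new):+6.1f}%  {name}")
```
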
***************************************************************************************************
lkp-skl-d03: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Skylake) with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-20240206.cgz/300s/lkp-skl-d03/migrate/vm-scalability
commit:
ff47a0acfc ("sched/fair: Check idle_cpu() before need_resched() to detect ilb CPU turning busy")
e932c4ab38 ("sched/core: Prevent wakeup of ksoftirqd during idle load balance")
ff47a0acfcce309c e932c4ab38f072ce5894b2851fe
---------------- ---------------------------
%stddev %change %stddev
\ | \
181821 -12.5% 159050 meminfo.Mapped
0.02 ± 4% -20.4% 0.01 ± 5% perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
45923 -12.8% 40022 proc-vmstat.nr_mapped
1.00 ± 99% -100.0% 0.00 ± 52% vm-scalability.free_time
2422987 +2.4% 2480833 vm-scalability.median
2422987 +2.4% 2480833 vm-scalability.throughput
90071 +2.5% 92323 vm-scalability.time.involuntary_context_switches
3.03 ± 3% -0.2 2.84 ± 3% perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.__mmput.exit_mm.do_exit
2.84 ± 2% -0.2 2.67 ± 3% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.__mmput.exit_mm
6.04 ± 2% -0.2 5.88 perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
6.06 ± 2% -0.2 5.89 perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
2.78 ± 2% -0.1 2.64 ± 2% perf-profile.calltrace.cycles-pp.filemap_map_pages.do_read_fault.do_pte_missing.__handle_mm_fault.handle_mm_fault
2.90 -0.1 2.77 ± 2% perf-profile.calltrace.cycles-pp.do_read_fault.do_pte_missing.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.90 ± 4% +0.1 0.95 ± 3% perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.92 ± 3% +0.1 0.99 ± 2% perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.77 ± 4% +0.1 0.84 ± 3% perf-profile.calltrace.cycles-pp.__mmap_region.do_mmap.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.80 ± 7% +0.1 0.91 ± 7% perf-profile.calltrace.cycles-pp.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
4.90 ± 2% -0.2 4.70 ± 2% perf-profile.children.cycles-pp.do_read_fault
6.09 ± 2% -0.2 5.92 perf-profile.children.cycles-pp.exit_mm
0.54 ± 2% -0.1 0.49 ± 8% perf-profile.children.cycles-pp.___perf_sw_event
0.39 ± 5% -0.0 0.35 ± 6% perf-profile.children.cycles-pp.vfs_open
0.20 ± 4% -0.0 0.16 ± 10% perf-profile.children.cycles-pp.opendir
0.15 ± 8% +0.0 0.19 ± 5% perf-profile.children.cycles-pp.__kmalloc_cache_noprof
0.18 ± 6% +0.0 0.22 ± 11% perf-profile.children.cycles-pp.__kernel_read
0.29 ± 5% +0.0 0.34 ± 5% perf-profile.children.cycles-pp.filemap_read
1.17 ± 4% +0.1 1.28 ± 4% perf-profile.children.cycles-pp.__memcg_slab_free_hook
0.44 ± 4% -0.0 0.40 ± 8% perf-profile.self.cycles-pp.___perf_sw_event
0.07 ± 15% -0.0 0.04 ± 71% perf-profile.self.cycles-pp.__folio_batch_add_and_move
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki