From: kernel test robot <oliver.sang@intel.com>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	<aubrey.li@linux.intel.com>, <yu.c.chen@intel.com>,
	<oliver.sang@intel.com>
Subject: [linus:master] [sched/core]  ea9cffc0a1: stream.triad_bandwidth_MBps 1.1% improvement
Date: Fri, 20 Dec 2024 10:46:35 +0800
Message-ID: <202412201007.aa43a5fa-lkp@intel.com>



Hello,

kernel test robot noticed a 1.1% improvement of stream.triad_bandwidth_MBps on:


commit: ea9cffc0a154124821531991d5afdd7e8b20d7aa ("sched/core: Remove the unnecessary need_resched() check in nohz_csd_func()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: stream
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Skylake) with 32G memory
parameters:

	nr_threads: 50%
	iterations: 10x
	array_size: 50000000
	loop: 100
	cpufreq_governor: performance



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241220/202412201007.aa43a5fa-lkp@intel.com

=========================================================================================
array_size/compiler/cpufreq_governor/iterations/kconfig/loop/nr_threads/rootfs/tbox_group/testcase:
  50000000/gcc-12/performance/10x/x86_64-rhel-9.4/100/50%/debian-12-x86_64-20240206.cgz/lkp-skl-d02/stream

commit: 
  6675ce2004 ("softirq: Allow raising SCHED_SOFTIRQ from SMP-call-function on RT kernel")
  ea9cffc0a1 ("sched/core: Remove the unnecessary need_resched() check in nohz_csd_func()")

6675ce20046d149e ea9cffc0a154124821531991d5a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     15264           +23.1%      18793        meminfo.Shmem
      0.02 ±  4%      +0.0        0.03 ±  4%  mpstat.cpu.all.soft%
      3818           +23.1%       4700        proc-vmstat.nr_shmem
    587.28          +302.4%       2363        vmstat.system.cs
      2577            -3.5%       2488        vmstat.system.in
     36673 ±  2%    +164.6%      97051 ±  2%  sched_debug.cpu.nr_switches.avg
     53585 ± 10%    +332.2%     231568 ± 16%  sched_debug.cpu.nr_switches.max
     12003 ± 23%    +578.7%      81463 ± 24%  sched_debug.cpu.nr_switches.stddev
    578.05          +310.5%       2372        perf-stat.i.context-switches
     14.72 ±  4%     +10.8%      16.30        perf-stat.i.cpu-migrations
      0.04 ±  5%    +268.8%       0.15        perf-stat.i.metric.K/sec
    575.63          +310.5%       2363        perf-stat.ps.context-switches
     14.65 ±  4%     +10.8%      16.23        perf-stat.ps.cpu-migrations
     18760            +1.0%      18950        stream.add_bandwidth_MBps
     18759            +1.0%      18948        stream.add_bandwidth_MBps_harmonicMean
     14581            +1.2%      14751        stream.scale_bandwidth_MBps
     14580            +1.2%      14748        stream.scale_bandwidth_MBps_harmonicMean
     18289            +1.1%      18487        stream.triad_bandwidth_MBps
     18287            +1.1%      18484        stream.triad_bandwidth_MBps_harmonicMean
      0.02 ± 12%     -32.3%       0.01 ± 16%  perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.02 ± 42%     -48.6%       0.01 ±  7%  perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.10 ± 70%    +332.7%       0.44 ± 95%  perf-sched.sch_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
     65.81 ±  3%     -68.0%      21.05 ±  3%  perf-sched.total_wait_and_delay.average.ms
      2011          +229.0%       6618 ±  4%  perf-sched.total_wait_and_delay.count.ms
     65.80 ±  3%     -68.0%      21.04 ±  3%  perf-sched.total_wait_time.average.ms
      3.86 ±  2%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
    500.54           +24.3%     622.17 ±  5%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
    497.31 ± 14%     -98.6%       6.72 ±  7%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.02 ± 15%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
     19.83 ± 22%    -100.0%       0.00        perf-sched.wait_and_delay.count.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     53.83 ±  9%   +8594.4%       4680 ±  5%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     21.00          -100.0%       0.00        perf-sched.wait_and_delay.count.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      3666 ± 51%     -72.7%       1000        perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      4.04          -100.0%       0.00        perf-sched.wait_and_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      1001          +136.0%       2362 ±  8%  perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      0.05 ± 37%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
    500.52           +24.3%     622.15 ±  5%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
    497.29 ± 14%     -98.7%       6.71 ±  7%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.00 ±165%    +525.0%       0.00 ± 68%  perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      3666 ± 51%     -72.7%       1000        perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      1001          +136.0%       2362 ±  8%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      0.01 ±142%    +247.8%       0.04 ± 54%  perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
     97.56            -0.4       97.12        perf-profile.calltrace.cycles-pp.main
      1.17 ±  2%      +0.2        1.42 ±  6%  perf-profile.calltrace.cycles-pp.common_startup_64
     97.61            -0.4       97.17        perf-profile.children.cycles-pp.main
      0.02 ±141%      +0.1        0.07 ± 14%  perf-profile.children.cycles-pp.poll_idle
      0.00            +0.1        0.06 ± 15%  perf-profile.children.cycles-pp.__hrtimer_start_range_ns
      0.00            +0.1        0.06 ± 15%  perf-profile.children.cycles-pp.dequeue_entity
      0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp.enqueue_dl_entity
      0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp.dl_server_start
      0.00            +0.1        0.06 ± 17%  perf-profile.children.cycles-pp.hrtimer_start_range_ns
      0.00            +0.1        0.06 ± 21%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp.update_load_avg
      0.00            +0.1        0.08 ± 17%  perf-profile.children.cycles-pp.__pick_next_task
      0.00            +0.1        0.10 ± 19%  perf-profile.children.cycles-pp.dequeue_entities
      0.00            +0.1        0.11 ± 17%  perf-profile.children.cycles-pp.dequeue_task_fair
      0.00            +0.1        0.11 ± 18%  perf-profile.children.cycles-pp.try_to_block_task
      0.01 ±223%      +0.1        0.14 ±  8%  perf-profile.children.cycles-pp.enqueue_task
      0.00            +0.1        0.13 ±  8%  perf-profile.children.cycles-pp.enqueue_task_fair
      0.01 ±223%      +0.1        0.14 ±  8%  perf-profile.children.cycles-pp.ttwu_do_activate
      0.05 ±  7%      +0.2        0.20 ± 11%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
      0.00            +0.2        0.18 ± 11%  perf-profile.children.cycles-pp.try_to_wake_up
      0.07 ± 14%      +0.2        0.25 ± 18%  perf-profile.children.cycles-pp.kthread
      0.07 ±  8%      +0.2        0.25 ± 18%  perf-profile.children.cycles-pp.ret_from_fork
      0.07 ±  8%      +0.2        0.25 ± 19%  perf-profile.children.cycles-pp.ret_from_fork_asm
      0.02 ±141%      +0.2        0.20 ± 20%  perf-profile.children.cycles-pp.schedule
      0.00            +0.2        0.19 ± 24%  perf-profile.children.cycles-pp.smpboot_thread_fn
      0.05 ±  8%      +0.2        0.25 ± 21%  perf-profile.children.cycles-pp.flush_smp_call_function_queue
      1.17 ±  2%      +0.2        1.42 ±  6%  perf-profile.children.cycles-pp.common_startup_64
      1.17 ±  2%      +0.2        1.42 ±  6%  perf-profile.children.cycles-pp.cpu_startup_entry
      1.17 ±  2%      +0.2        1.42 ±  6%  perf-profile.children.cycles-pp.do_idle
      0.09 ± 39%      +0.2        0.34 ± 20%  perf-profile.children.cycles-pp.__schedule
     97.30            -0.4       96.86        perf-profile.self.cycles-pp.main
      0.02 ±141%      +0.0        0.06 ± 14%  perf-profile.self.cycles-pp.poll_idle
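
For readers who want to sanity-check the %change column, the following is a minimal sketch (not part of the robot's tooling) that recomputes the relative change from the two displayed means for a few rows above. The helper name `pct_change` and the `rows` dictionary are made up for illustration. Because the table shows rounded means, a recomputed value can differ from the reported one in the last digit. Rows whose change has no percent sign (mpstat.cpu.all.soft% and the perf-profile entries) appear to be plain differences between two percentages, so the formula below does not apply to them.

```python
def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0

# (base mean, new mean) copied from the comparison table above:
# left column is 6675ce2004, right column is ea9cffc0a1.
rows = {
    "meminfo.Shmem": (15264, 18793),
    "vmstat.system.cs": (587.28, 2363),
    "stream.add_bandwidth_MBps": (18760, 18950),
    "stream.scale_bandwidth_MBps": (14581, 14751),
    "stream.triad_bandwidth_MBps": (18289, 18487),
}

for name, (base, new) in rows.items():
    print(f"{name:32s} {pct_change(base, new):+8.1f}%")
```

On these inputs the bandwidth gains stay close to 1% while vmstat.system.cs roughly quadruples, which matches the pattern in the table: a small throughput improvement alongside a much larger rise in context switches.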




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

