From: kernel test robot <oliver.sang@intel.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	<oliver.sang@intel.com>
Subject: [linus:master] [rseq]  abc850e761: stress-ng.sem.sem_wait_calls_per_sec 3.1% improvement
Date: Tue, 9 Dec 2025 23:41:37 +0800
Message-ID: <202512092342.3ee2de77-lkp@intel.com>



Hello,

kernel test robot noticed a 3.1% improvement in stress-ng.sem.sem_wait_calls_per_sec on:


commit: abc850e7616c91ebaa3f5ba3617ab0a104d45039 ("rseq: Provide and use rseq_update_user_cs()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads, 2 sockets, Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest), 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sem
	cpufreq_governor: performance



Details are below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251209/202512092342.3ee2de77-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp3/sem/stress-ng/60s

commit: 
  9c37cb6e80 ("rseq: Provide static branch for runtime debugging")
  abc850e761 ("rseq: Provide and use rseq_update_user_cs()")

9c37cb6e80b8fcdd abc850e7616c91ebaa3f5ba3617 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    713480 ± 29%     -24.9%     536114 ± 28%  meminfo.Mapped
  19261235 ± 14%     -28.3%   13815751 ± 45%  perf-sched.total_wait_and_delay.count.ms
  19261235 ± 14%     -28.3%   13815751 ± 45%  perf-sched.wait_and_delay.count.[unknown].[unknown].[unknown].[unknown].[unknown]
    209285 ±  4%      -3.8%     201417 ±  3%  proc-vmstat.nr_anon_pages
    179839 ± 29%     -25.3%     134393 ± 28%  proc-vmstat.nr_mapped
      0.21            +0.0        0.22        perf-stat.i.branch-miss-rate%
 3.044e+08            +3.6%  3.154e+08        perf-stat.i.branch-misses
 1.933e+08            +3.5%  2.001e+08        perf-stat.i.context-switches
      0.93 ±  3%      +6.7%       0.99        perf-stat.i.metric.M/sec
      0.20            +0.0        0.21        perf-stat.overall.branch-miss-rate%
 2.996e+08            +3.6%  3.104e+08        perf-stat.ps.branch-misses
 1.903e+08            +3.5%   1.97e+08        perf-stat.ps.context-switches
 1.341e+10            +2.6%  1.377e+10        stress-ng.sem.ops
 2.235e+08            +2.6%  2.294e+08        stress-ng.sem.ops_per_sec
    374680            +3.1%     386364        stress-ng.sem.sem_timedwait_calls_per_sec
    374638            +3.2%     386525        stress-ng.sem.sem_trywait_calls_per_sec
    374649            +3.1%     386331        stress-ng.sem.sem_wait_calls_per_sec
 1.178e+10            +3.5%  1.219e+10        stress-ng.time.involuntary_context_switches
      7623            -1.2%       7530        stress-ng.time.system_time
      3803            +2.8%       3908        stress-ng.time.user_time
      8.44            -2.8        5.65        perf-profile.calltrace.cycles-pp.__rseq_handle_notify_resume.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      9.62            -2.6        7.02        perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     59.80            -1.0       58.80        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     60.34            -1.0       59.38        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.83            +0.0        0.86        perf-profile.calltrace.cycles-pp._raw_spin_lock.raw_spin_rq_lock_nested.__schedule.schedule.__x64_sys_sched_yield
      0.65            +0.0        0.68        perf-profile.calltrace.cycles-pp.__update_load_avg_se.update_load_avg.put_prev_entity.pick_next_task_fair.__pick_next_task
      1.00            +0.0        1.03        perf-profile.calltrace.cycles-pp.update_load_avg.set_next_entity.pick_next_task_fair.__pick_next_task.__schedule
      1.11            +0.0        1.15        perf-profile.calltrace.cycles-pp.__enqueue_entity.put_prev_entity.pick_next_task_fair.__pick_next_task.__schedule
      1.56            +0.1        1.62        perf-profile.calltrace.cycles-pp.update_load_avg.put_prev_entity.pick_next_task_fair.__pick_next_task.__schedule
      1.73            +0.1        1.79        perf-profile.calltrace.cycles-pp.native_sched_clock.sched_clock.sched_clock_cpu.update_rq_clock.yield_task_fair
      1.77            +0.1        1.84        perf-profile.calltrace.cycles-pp.sched_clock.sched_clock_cpu.update_rq_clock.yield_task_fair.do_sched_yield
      2.13            +0.1        2.20        perf-profile.calltrace.cycles-pp.__rdgsbase_inactive.__sched_yield
      1.84            +0.1        1.90        perf-profile.calltrace.cycles-pp.sched_clock_cpu.update_rq_clock.yield_task_fair.do_sched_yield.__x64_sys_sched_yield
      1.95            +0.1        2.02        perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__pick_next_task.__schedule.schedule
      2.06            +0.1        2.14        perf-profile.calltrace.cycles-pp.update_rq_clock.yield_task_fair.do_sched_yield.__x64_sys_sched_yield.do_syscall_64
      2.38            +0.1        2.46        perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
      2.98            +0.1        3.08        perf-profile.calltrace.cycles-pp.__wrgsbase_inactive.__sched_yield
      3.26            +0.1        3.38        perf-profile.calltrace.cycles-pp.put_prev_entity.pick_next_task_fair.__pick_next_task.__schedule.schedule
      5.15            +0.2        5.32        perf-profile.calltrace.cycles-pp.yield_task_fair.do_sched_yield.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.45            +0.2        6.66        perf-profile.calltrace.cycles-pp.do_sched_yield.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      3.28            +0.2        3.50        perf-profile.calltrace.cycles-pp.rseq_update_cpu_node_id.__rseq_handle_notify_resume.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      9.21            +0.3        9.50        perf-profile.calltrace.cycles-pp.pick_next_task_fair.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield
      9.41            +0.3        9.70        perf-profile.calltrace.cycles-pp.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
     17.41            +0.5       17.88        perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
     17.59            +0.5       18.07        perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     16.02            +0.6       16.60        perf-profile.calltrace.cycles-pp.os_xsave.__sched_yield
     24.16            +0.7       24.86        perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     20.98            +0.7       21.71        perf-profile.calltrace.cycles-pp.restore_fpregs_from_fpstate.switch_fpu_return.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe
     23.42            +0.8       24.26        perf-profile.calltrace.cycles-pp.switch_fpu_return.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     25.22            +0.9       26.11        perf-profile.calltrace.cycles-pp.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      8.52            -2.6        5.88        perf-profile.children.cycles-pp.__rseq_handle_notify_resume
      9.64            -2.6        7.04        perf-profile.children.cycles-pp.exit_to_user_mode_loop
     59.97            -1.0       58.99        perf-profile.children.cycles-pp.do_syscall_64
     60.46            -1.0       59.50        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.09            +0.0        0.10        perf-profile.children.cycles-pp.propagate_entity_load_avg
      0.12 ±  4%      +0.0        0.14 ±  3%  perf-profile.children.cycles-pp.clock_gettime
      1.12            +0.0        1.16        perf-profile.children.cycles-pp.__enqueue_entity
      1.14            +0.0        1.18        perf-profile.children.cycles-pp.__update_load_avg_se
      1.97            +0.1        2.04        perf-profile.children.cycles-pp.set_next_entity
      2.38            +0.1        2.46        perf-profile.children.cycles-pp.prepare_task_switch
      2.41            +0.1        2.49        perf-profile.children.cycles-pp.__rdgsbase_inactive
      2.67            +0.1        2.77        perf-profile.children.cycles-pp.update_load_avg
      3.26            +0.1        3.37        perf-profile.children.cycles-pp.__wrgsbase_inactive
      3.30            +0.1        3.41        perf-profile.children.cycles-pp.put_prev_entity
      5.17            +0.2        5.34        perf-profile.children.cycles-pp.yield_task_fair
      6.49            +0.2        6.70        perf-profile.children.cycles-pp.do_sched_yield
      3.45            +0.2        3.68        perf-profile.children.cycles-pp.rseq_update_cpu_node_id
      9.24            +0.3        9.54        perf-profile.children.cycles-pp.pick_next_task_fair
      9.43            +0.3        9.73        perf-profile.children.cycles-pp.__pick_next_task
     17.48            +0.5       17.96        perf-profile.children.cycles-pp.__schedule
     17.61            +0.5       18.09        perf-profile.children.cycles-pp.schedule
     16.04            +0.6       16.61        perf-profile.children.cycles-pp.os_xsave
     24.18            +0.7       24.88        perf-profile.children.cycles-pp.__x64_sys_sched_yield
     21.01            +0.7       21.75        perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
     23.44            +0.8       24.28        perf-profile.children.cycles-pp.switch_fpu_return
     25.24            +0.9       26.15        perf-profile.children.cycles-pp.arch_exit_to_user_mode_prepare
      0.67            +0.0        0.70        perf-profile.self.cycles-pp.___perf_sw_event
      0.92            +0.0        0.95        perf-profile.self.cycles-pp.update_curr
      1.11            +0.0        1.15        perf-profile.self.cycles-pp.__enqueue_entity
      0.80            +0.0        0.84        perf-profile.self.cycles-pp.update_load_avg
      0.70            +0.0        0.73        perf-profile.self.cycles-pp.pick_next_task_fair
      1.04            +0.0        1.08        perf-profile.self.cycles-pp.exit_to_user_mode_loop
      1.12            +0.1        1.17        perf-profile.self.cycles-pp.__update_load_avg_se
      1.78            +0.1        1.84        perf-profile.self.cycles-pp.arch_exit_to_user_mode_prepare
      1.48            +0.1        1.54        perf-profile.self.cycles-pp.prepare_task_switch
      0.69            +0.1        0.75        perf-profile.self.cycles-pp.do_syscall_64
      2.40            +0.1        2.48        perf-profile.self.cycles-pp.__rdgsbase_inactive
      3.12            +0.1        3.23        perf-profile.self.cycles-pp.__wrgsbase_inactive
     16.02            +0.6       16.61        perf-profile.self.cycles-pp.os_xsave
     21.00            +0.7       21.74        perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
      0.37            +2.0        2.40        perf-profile.self.cycles-pp.__rseq_handle_notify_resume
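
The %change column mixes two kinds of values: throughput and counter rows
show a relative change in percent, while the perf-profile rows appear to show
an absolute difference in percentage points of cycles. The sketch below
recomputes a few rows from the table to illustrate this. The helper names
pct_change and point_delta are hypothetical and are not part of lkp-tests;
the numbers are copied from the table above.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change in percent (throughput and counter rows)."""
    return (patched - base) / base * 100.0


def point_delta(base_pct: float, patched_pct: float) -> float:
    """Absolute change in percentage points (perf-profile rows)."""
    return patched_pct - base_pct


# Values copied from the table: parent 9c37cb6e80 vs. abc850e761.
sem_wait = pct_change(374649, 386331)    # stress-ng.sem.sem_wait_calls_per_sec
sys_time = pct_change(7623, 7530)        # stress-ng.time.system_time
rseq_children = point_delta(8.52, 5.88)  # children.__rseq_handle_notify_resume
rseq_self = point_delta(0.37, 2.40)      # self.__rseq_handle_notify_resume

# In perf terms, "children" minus "self" approximates cycles spent in callees.
callees_base = 8.52 - 0.37
callees_patched = 5.88 - 2.40

print(f"sem_wait_calls_per_sec:      {sem_wait:+.1f}%")
print(f"system_time:                 {sys_time:+.1f}%")
print(f"rseq notify_resume children: {rseq_children:+.1f} points")
print(f"rseq notify_resume self:     {rseq_self:+.1f} points")
print(f"rseq notify_resume callees:  {callees_base:.2f} -> {callees_patched:.2f}")
```

This reproduces the +3.1%, -1.2%, -2.6 and +2.0 entries above. The drop in
the callee share alongside the rise in self time is consistent with work
moving from callees into __rseq_handle_notify_resume itself (for example
through inlining), but that reading is an assumption and is not stated in
this report.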




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

