From: Peter Zijlstra <peterz@infradead.org>
To: Tejun Heo <tj@kernel.org>
Cc: Xuewen Yan <xuewen.yan@unisoc.com>,
mingo@redhat.com, juri.lelli@redhat.com,
vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
vschneid@redhat.com, lukasz.luba@arm.com,
linux-kernel@vger.kernel.org, rui.zhang@intel.com,
di.shen@unisoc.com, ke.wang@unisoc.com, xuewen.yan94@gmail.com,
ubizjak@gmail.com, Marco Elver <elver@google.com>
Subject: Re: [RFC PATCH] sched: Add scx_cpuperf_target in sched_cpu_util()
Date: Thu, 19 Mar 2026 10:02:40 +0100
Message-ID: <20260319090240.GS3738010@noisy.programming.kicks-ass.net>
In-Reply-To: <abtMmzntD4XCrG2M@slm.duckdns.org>

On Wed, Mar 18, 2026 at 03:08:43PM -1000, Tejun Heo wrote:
> On Wed, Mar 18, 2026 at 01:47:18PM +0100, Peter Zijlstra wrote:
> > On Wed, Mar 18, 2026 at 08:17:55PM +0800, Xuewen Yan wrote:
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index bf948db905ed..20adb6fede2a 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -8198,7 +8198,12 @@ unsigned long effective_cpu_util(int cpu, unsigned long util_cfs,
> > >
> > >  unsigned long sched_cpu_util(int cpu)
> > >  {
> > > -	return effective_cpu_util(cpu, cpu_util_cfs(cpu), NULL, NULL);
> > > +	unsigned long util = scx_cpuperf_target(cpu);
> > > +
> > > +	if (!scx_switched_all())
> > > +		util += cpu_util_cfs(cpu);
> > > +
> > > +	return effective_cpu_util(cpu, util, NULL, NULL);
> > >  }
> >
> > This puts the common case of no ext muck into the slow path of that
> > static_branch.
> >
> > This wants to be something like:
> >
> > unsigned long sched_cpu_util(int cpu)
> > {
> > 	unsigned long util = cpu_util_cfs(cpu);
> >
> > 	if (scx_enabled()) {
> > 		unsigned long scx_util = scx_cpuperf_target(cpu);
> >
> > 		if (!scx_switched_all())
> > 			scx_util += util;
> >
> > 		util = scx_util;
> > 	}
> >
> > 	return effective_cpu_util(cpu, util, NULL, NULL);
> > }
>
> scx_switched_all() is an unlikely static branch just like scx_enabled() and
> scx_cpuperf_target() has scx_enabled() in it too, so the difference for the
> fair path between the two versions is two noop run-throughs vs. one. Either
> way is fine but it is more code for likely no discernible gain.

(added noinline to effective_cpu_util() for clarity)

So the original patch generates this:

sched_cpu_util:
1c5240:  sched_cpu_util+0x0   endbr64
1c5244:  sched_cpu_util+0x4   call 0x1c5249 <__fentry__>
1c5249:  sched_cpu_util+0x9   push %rbp
1c524a:  sched_cpu_util+0xa   push %rbx
1c524b:  sched_cpu_util+0xb   mov %edi,%ebx
1c524d:  sched_cpu_util+0xd   <jump_table.1c524d>
                              = nop2 (if DEFAULT)
                              = jmp 1c5271 <sched_cpu_util+0x31> (if JUMP)
1c524f:  sched_cpu_util+0xf   xor %ebp,%ebp
1c5251:  sched_cpu_util+0x11  <jump_table.1c5251>
                              = nop2 (if DEFAULT)
                              = jmp 1c5261 <sched_cpu_util+0x21> (if JUMP)
1c5253:  sched_cpu_util+0x13  xor %edx,%edx
1c5255:  sched_cpu_util+0x15  xor %esi,%esi
1c5257:  sched_cpu_util+0x17  mov %ebx,%edi
1c5259:  sched_cpu_util+0x19  call 0x1bc5b0 <cpu_util.constprop.0>
1c525e:  sched_cpu_util+0x1e  add %rax,%rbp
1c5261:  sched_cpu_util+0x21  mov %rbp,%rsi
1c5264:  sched_cpu_util+0x24  mov %ebx,%edi
1c5266:  sched_cpu_util+0x26  xor %ecx,%ecx
1c5268:  sched_cpu_util+0x28  pop %rbx
1c5269:  sched_cpu_util+0x29  xor %edx,%edx
1c526b:  sched_cpu_util+0x2b  pop %rbp
1c526c:  sched_cpu_util+0x2c  jmp 0x1c5160 <effective_cpu_util>

(slowpath)
1c5271:  sched_cpu_util+0x31  movslq %edi,%rdx
1c5274:  sched_cpu_util+0x34  mov $0x0,%rax
1c527b:  sched_cpu_util+0x3b  mov 0x0(,%rdx,8),%rdx
1c5283:  sched_cpu_util+0x43  mov 0xa34(%rdx,%rax,1),%ebp
1c528a:  sched_cpu_util+0x4a  jmp 0x1c5251 <sched_cpu_util+0x11>

While my proposal generates this:

sched_cpu_util:
1c5240:  sched_cpu_util+0x0   endbr64
1c5244:  sched_cpu_util+0x4   call 0x1c5249 <__fentry__>
1c5249:  sched_cpu_util+0x9   push %rbx
1c524a:  sched_cpu_util+0xa   xor %esi,%esi
1c524c:  sched_cpu_util+0xc   xor %edx,%edx
1c524e:  sched_cpu_util+0xe   mov %edi,%ebx
1c5250:  sched_cpu_util+0x10  call 0x1bc5b0 <cpu_util.constprop.0>
1c5255:  sched_cpu_util+0x15  mov %rax,%rsi
1c5258:  sched_cpu_util+0x18  <jump_table.1c5258>
                              = nop2 (if DEFAULT)
                              = jmp 1c5266 <sched_cpu_util+0x26> (if JUMP)
1c525a:  sched_cpu_util+0x1a  mov %ebx,%edi
1c525c:  sched_cpu_util+0x1c  xor %ecx,%ecx
1c525e:  sched_cpu_util+0x1e  xor %edx,%edx
1c5260:  sched_cpu_util+0x20  pop %rbx
1c5261:  sched_cpu_util+0x21  jmp 0x1c5160 <effective_cpu_util>

(slowpath)
1c5266:  sched_cpu_util+0x26  <jump_table.1c5266>
                              = nop2 (if DEFAULT)
                              = jmp 1c527b <sched_cpu_util+0x3b> (if JUMP)
1c5268:  sched_cpu_util+0x28  xor %eax,%eax
1c526a:  sched_cpu_util+0x2a  <jump_table.1c526a>
                              = nop2 (if DEFAULT)
                              = jmp 1c5296 <sched_cpu_util+0x56> (if JUMP)
1c526c:  sched_cpu_util+0x2c  mov %ebx,%edi
1c526e:  sched_cpu_util+0x2e  add %rax,%rsi
1c5271:  sched_cpu_util+0x31  xor %ecx,%ecx
1c5273:  sched_cpu_util+0x33  xor %edx,%edx
1c5275:  sched_cpu_util+0x35  pop %rbx
1c5276:  sched_cpu_util+0x36  jmp 0x1c5160 <effective_cpu_util>
1c527b:  sched_cpu_util+0x3b  movslq %ebx,%rdx
1c527e:  sched_cpu_util+0x3e  mov $0x0,%rax
1c5285:  sched_cpu_util+0x45  mov 0x0(,%rdx,8),%rdx
1c528d:  sched_cpu_util+0x4d  mov 0xa34(%rdx,%rax,1),%eax
1c5294:  sched_cpu_util+0x54  jmp 0x1c526a <sched_cpu_util+0x2a>
1c5296:  sched_cpu_util+0x56  mov %ebx,%edi
1c5298:  sched_cpu_util+0x58  mov %rax,%rsi
1c529b:  sched_cpu_util+0x5b  xor %ecx,%ecx
1c529d:  sched_cpu_util+0x5d  xor %edx,%edx
1c529f:  sched_cpu_util+0x5f  pop %rbx
1c52a0:  sched_cpu_util+0x60  jmp 0x1c5160 <effective_cpu_util>

That fastpath is definitely better; the slowpath is worse, but that is
in part because the compilers are stupid and cannot eliminate
static_branch().

/me goes try again .. Yeah, the below patch does nothing :-( It will
happily emit scx_enabled() twice.

---
diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index 05b16299588d..47cd1a1f9784 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -32,7 +32,7 @@
 	JUMP_TABLE_ENTRY(key, label)
 #endif /* CONFIG_HAVE_JUMP_LABEL_HACK */
 
-static __always_inline bool arch_static_branch(struct static_key * const key, const bool branch)
+static __always_inline __const bool arch_static_branch(struct static_key * const key, const bool branch)
 {
 	asm goto(ARCH_STATIC_BRANCH_ASM("%c0 + %c1", "%l[l_yes]")
 		: : "i" (key), "i" (branch) : : l_yes);
@@ -42,7 +42,7 @@ static __always_inline bool arch_static_branch(struct static_key * const key, co
 	return true;
 }
 
-static __always_inline bool arch_static_branch_jump(struct static_key * const key, const bool branch)
+static __always_inline __const bool arch_static_branch_jump(struct static_key * const key, const bool branch)
 {
 	asm goto("1:"
 		"jmp %l[l_yes]\n\t"
diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index c16d4199bf92..553fc9f3f7eb 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -312,6 +312,7 @@
  * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-pure-function-attribute
  */
 #define __pure __attribute__((__pure__))
+#define __const __attribute__((__const__))
 
 /*
  * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-section-function-attribute
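
For anyone wanting to check that the two C variants quoted above compute
the same utilization, here is a minimal userspace sketch. It is not kernel
code: struct model and the model_*/util_* names are hypothetical stand-ins,
the static branches are modeled as plain booleans, and it assumes that
scx_switched_all() is never true while scx_enabled() is false. Under that
assumption the two orderings should agree; only the placement of the
branches relative to the cpu_util_cfs() call differs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for kernel state; not the kernel API. */
struct model {
	bool scx_enabled;	/* models scx_enabled() */
	bool scx_switched_all;	/* models scx_switched_all() */
	unsigned long util_cfs;	/* models cpu_util_cfs(cpu) */
	unsigned long scx_perf;	/* models the per-CPU scx perf target */
};

/*
 * Models scx_cpuperf_target(): per the thread it checks scx_enabled()
 * itself, so it yields 0 when scx is off.
 */
static unsigned long model_scx_cpuperf_target(const struct model *m)
{
	return m->scx_enabled ? m->scx_perf : 0;
}

/* Ordering used by the original RFC patch. */
static unsigned long util_original(const struct model *m)
{
	unsigned long util = model_scx_cpuperf_target(m);

	if (!m->scx_switched_all)
		util += m->util_cfs;

	return util;
}

/* Ordering from the proposal: one check guards all scx work. */
static unsigned long util_proposed(const struct model *m)
{
	unsigned long util = m->util_cfs;

	if (m->scx_enabled) {
		unsigned long scx_util = model_scx_cpuperf_target(m);

		if (!m->scx_switched_all)
			scx_util += util;

		util = scx_util;
	}

	return util;
}
```

Feeding both functions the three reachable states (scx off, scx partially
switched, scx fully switched) gives identical results. The state "switched
all but not enabled" is deliberately excluded; there the two models would
differ.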