* [PATCH 0/4] sched/fair: Core sched wake up path improvements
@ 2025-09-22 12:39 Fernand Sieber
2025-09-22 12:39 ` [PATCH 1/4] sched/fair: Fix cookie check on __select_idle_cpu() Fernand Sieber
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Fernand Sieber @ 2025-09-22 12:39 UTC (permalink / raw)
To: mingo, peterz
Cc: linux-kernel, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, dwmw, jschoenh,
liuyuxua, graf
This patch series addresses several issues in the core scheduling logic of
the fair scheduler's wake-up paths. The overall result is better task
placement, reducing force idle.
The main issues addressed are as follows:
Slow path:
1. Fix incorrect cookie matching logic that wrongly discards idle cores
2. Better fallback logic when no cookie matching target is found
Fast path:
3. Add cookie checks in wake affine idle to prevent force idle
4. Enhance task selection in select idle sibling to consider cookies
Fernand Sieber (4):
sched/fair: Fix cookie check on __select_idle_cpu()
sched/fair: Still look for the idlest cpu with no matching cookie
sched/fair: Add cookie checks on wake idle path
sched/fair: Add more core cookie check in wake up fast path
kernel/sched/fair.c | 49 ++++++++++++++++++++++++++++++++------------
kernel/sched/sched.h | 41 ++++++++++++++++++++----------------
2 files changed, 59 insertions(+), 31 deletions(-)
--
2.43.0
Amazon Development Centre (South Africa) (Proprietary) Limited
29 Gogosoa Street, Observatory, Cape Town, Western Cape, 7925, South Africa
Registration Number: 2004 / 034463 / 07
* [PATCH 1/4] sched/fair: Fix cookie check on __select_idle_cpu()
2025-09-22 12:39 [PATCH 0/4] sched/fair: Core sched wake up path improvements Fernand Sieber
@ 2025-09-22 12:39 ` Fernand Sieber
2025-09-23 8:42 ` K Prateek Nayak
2025-09-22 12:39 ` [PATCH 2/4] sched/fair: Still look for the idlest cpu with no matching cookie Fernand Sieber
` (2 subsequent siblings)
3 siblings, 1 reply; 15+ messages in thread
From: Fernand Sieber @ 2025-09-22 12:39 UTC (permalink / raw)
To: mingo, peterz
Cc: linux-kernel, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, dwmw, jschoenh,
liuyuxua
The __select_idle_cpu() function uses sched_cpu_cookie_match() to determine
whether a task can be placed on an idle CPU. That check incorrectly returns
false when the whole core is idle but the task has a cookie, preventing
proper task placement.
Replace sched_cpu_cookie_match() with sched_core_cookie_match(), which
correctly handles the idle core case. Also refactor select_idle_smt() to
avoid duplicate work by checking core cookie compatibility once for the
whole SMT mask.
Fixes: 97886d9dcd868 ("sched: Migration changes for core scheduling")
Signed-off-by: Fernand Sieber <sieberf@amazon.com>
---
kernel/sched/fair.c | 5 ++++-
kernel/sched/sched.h | 14 --------------
2 files changed, 4 insertions(+), 15 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b173a059315c..43ddfc25af99 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7447,7 +7447,7 @@ static inline int sched_balance_find_dst_cpu(struct sched_domain *sd, struct tas
static inline int __select_idle_cpu(int cpu, struct task_struct *p)
{
if ((available_idle_cpu(cpu) || sched_idle_cpu(cpu)) &&
- sched_cpu_cookie_match(cpu_rq(cpu), p))
+ sched_core_cookie_match(cpu_rq(cpu), p))
return cpu;
return -1;
@@ -7546,6 +7546,9 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
{
int cpu;
+ if (!sched_core_cookie_match(cpu_rq(target), p))
+ return -1;
+
for_each_cpu_and(cpu, cpu_smt_mask(target), p->cpus_ptr) {
if (cpu == target)
continue;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index be9745d104f7..4e7080123a4c 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1386,15 +1386,6 @@ extern void task_vruntime_update(struct rq *rq, struct task_struct *p, bool in_f
* A special case is that the task's cookie always matches with CPU's core
* cookie if the CPU is in an idle core.
*/
-static inline bool sched_cpu_cookie_match(struct rq *rq, struct task_struct *p)
-{
- /* Ignore cookie match if core scheduler is not enabled on the CPU. */
- if (!sched_core_enabled(rq))
- return true;
-
- return rq->core->core_cookie == p->core_cookie;
-}
-
static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
{
bool idle_core = true;
@@ -1468,11 +1459,6 @@ static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
return &rq->__lock;
}
-static inline bool sched_cpu_cookie_match(struct rq *rq, struct task_struct *p)
-{
- return true;
-}
-
static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
{
return true;
--
2.43.0
* [PATCH 2/4] sched/fair: Still look for the idlest cpu with no matching cookie
2025-09-22 12:39 [PATCH 0/4] sched/fair: Core sched wake up path improvements Fernand Sieber
2025-09-22 12:39 ` [PATCH 1/4] sched/fair: Fix cookie check on __select_idle_cpu() Fernand Sieber
@ 2025-09-22 12:39 ` Fernand Sieber
2025-09-23 1:51 ` K Prateek Nayak
2025-09-22 12:39 ` [PATCH 3/4] sched/fair: Add cookie checks on wake idle path Fernand Sieber
2025-09-22 12:39 ` [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path Fernand Sieber
3 siblings, 1 reply; 15+ messages in thread
From: Fernand Sieber @ 2025-09-22 12:39 UTC (permalink / raw)
To: mingo, peterz
Cc: linux-kernel, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, dwmw, jschoenh,
liuyuxua
The slow path for waking tasks currently discards all potential targets
when no cookie-matching CPU is found, leading to suboptimal task placement.
Fall back to selecting the idlest CPU when no cookie-matching target is
available, ensuring better CPU utilization while maintaining the preference
for cookie-compatible placements.
Signed-off-by: Fernand Sieber <sieberf@amazon.com>
---
kernel/sched/fair.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 43ddfc25af99..67746899809e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7335,7 +7335,8 @@ sched_balance_find_dst_group(struct sched_domain *sd, struct task_struct *p, int
* sched_balance_find_dst_group_cpu - find the idlest CPU among the CPUs in the group.
*/
static int
-sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
+__sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *p,
+ int this_cpu, bool cookie_match)
{
unsigned long load, min_load = ULONG_MAX;
unsigned int min_exit_latency = UINT_MAX;
@@ -7352,7 +7353,8 @@ sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *
for_each_cpu_and(i, sched_group_span(group), p->cpus_ptr) {
struct rq *rq = cpu_rq(i);
- if (!sched_core_cookie_match(rq, p))
+ /* Consider only cookie-matching CPUs if cookie_match, else only non-matching ones */
+ if (cookie_match ^ sched_core_cookie_match(rq, p))
continue;
if (sched_idle_cpu(i))
@@ -7391,6 +7393,17 @@ sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *
return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
}
+/*
+ * sched_balance_find_dst_group_cpu - find the idlest CPU among the CPUs in the group.
+ */
+static inline int
+sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
+{
+ int cpu = __sched_balance_find_dst_group_cpu(group, p, this_cpu, true);
+
+ return cpu >= 0 ? cpu : __sched_balance_find_dst_group_cpu(group, p, this_cpu, false);
+}
+
static inline int sched_balance_find_dst_cpu(struct sched_domain *sd, struct task_struct *p,
int cpu, int prev_cpu, int sd_flag)
{
--
2.43.0
* [PATCH 3/4] sched/fair: Add cookie checks on wake idle path
2025-09-22 12:39 [PATCH 0/4] sched/fair: Core sched wake up path improvements Fernand Sieber
2025-09-22 12:39 ` [PATCH 1/4] sched/fair: Fix cookie check on __select_idle_cpu() Fernand Sieber
2025-09-22 12:39 ` [PATCH 2/4] sched/fair: Still look for the idlest cpu with no matching cookie Fernand Sieber
@ 2025-09-22 12:39 ` Fernand Sieber
2025-09-22 12:39 ` [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path Fernand Sieber
3 siblings, 0 replies; 15+ messages in thread
From: Fernand Sieber @ 2025-09-22 12:39 UTC (permalink / raw)
To: mingo, peterz
Cc: linux-kernel, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, dwmw, jschoenh,
liuyuxua
The wake_affine_idle() function determines whether the previous CPU or the
waking CPU is suitable for running a waking task. Currently it does not
consider core scheduling constraints.
Add cookie compatibility checks to prevent considering a CPU idle when
placing the task there would immediately cause force idle due to an
incompatible sibling task. This reduces unnecessary force idle scenarios
in the wake-up path.
Signed-off-by: Fernand Sieber <sieberf@amazon.com>
---
kernel/sched/fair.c | 19 +++++++++++++------
kernel/sched/sched.h | 33 ++++++++++++++++++++++++++-------
2 files changed, 39 insertions(+), 13 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 67746899809e..78b36225a039 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7235,7 +7235,7 @@ static int wake_wide(struct task_struct *p)
* for the overloaded case.
*/
static int
-wake_affine_idle(int this_cpu, int prev_cpu, int sync)
+wake_affine_idle(struct task_struct *p, int this_cpu, int prev_cpu, int sync)
{
/*
* If this_cpu is idle, it implies the wakeup is from interrupt
@@ -7249,17 +7249,24 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
* a cpufreq perspective, it's better to have higher utilisation
* on one CPU.
*/
- if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
- return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
+ if (available_idle_cpu(this_cpu) &&
+ cpus_share_cache(this_cpu, prev_cpu) &&
+ sched_core_cookie_match(cpu_rq(this_cpu), p)) {
+ return available_idle_cpu(prev_cpu) &&
+ sched_core_cookie_match(cpu_rq(prev_cpu), p) ?
+ prev_cpu : this_cpu;
+ }
if (sync) {
struct rq *rq = cpu_rq(this_cpu);
- if ((rq->nr_running - cfs_h_nr_delayed(rq)) == 1)
+ if (((rq->nr_running - cfs_h_nr_delayed(rq)) == 1) &&
+ sched_core_cookie_match_sync(rq, p))
return this_cpu;
}
- if (available_idle_cpu(prev_cpu))
+ if (available_idle_cpu(prev_cpu) &&
+ sched_core_cookie_match(cpu_rq(prev_cpu), p))
return prev_cpu;
return nr_cpumask_bits;
@@ -7314,7 +7321,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
int target = nr_cpumask_bits;
if (sched_feat(WA_IDLE))
- target = wake_affine_idle(this_cpu, prev_cpu, sync);
+ target = wake_affine_idle(p, this_cpu, prev_cpu, sync);
if (sched_feat(WA_WEIGHT) && target == nr_cpumask_bits)
target = wake_affine_weight(sd, p, this_cpu, prev_cpu, sync);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 4e7080123a4c..97cc8c66519e 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1386,27 +1386,41 @@ extern void task_vruntime_update(struct rq *rq, struct task_struct *p, bool in_f
* A special case is that the task's cookie always matches with CPU's core
* cookie if the CPU is in an idle core.
*/
-static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
+static inline bool __sched_core_cookie_match(struct rq *rq,
+ struct task_struct *p,
+ bool sync)
{
- bool idle_core = true;
int cpu;
/* Ignore cookie match if core scheduler is not enabled on the CPU. */
if (!sched_core_enabled(rq))
return true;
+ if (rq->core->core_cookie == p->core_cookie)
+ return true;
+
for_each_cpu(cpu, cpu_smt_mask(cpu_of(rq))) {
- if (!available_idle_cpu(cpu)) {
- idle_core = false;
- break;
- }
+ if (sync && cpu_of(rq) == cpu)
+ continue;
+ if (!available_idle_cpu(cpu))
+ return false;
}
/*
* A CPU in an idle core is always the best choice for tasks with
* cookies.
*/
- return idle_core || rq->core->core_cookie == p->core_cookie;
+ return true;
+}
+
+static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
+{
+ return __sched_core_cookie_match(rq, p, false);
+}
+
+static inline bool sched_core_cookie_match_sync(struct rq *rq, struct task_struct *p)
+{
+ return __sched_core_cookie_match(rq, p, true);
}
static inline bool sched_group_cookie_match(struct rq *rq,
@@ -1464,6 +1478,11 @@ static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
return true;
}
+static inline bool sched_core_cookie_match_sync(struct rq *rq, struct task_struct *p)
+{
+ return true;
+}
+
static inline bool sched_group_cookie_match(struct rq *rq,
struct task_struct *p,
struct sched_group *group)
--
2.43.0
* [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path
2025-09-22 12:39 [PATCH 0/4] sched/fair: Core sched wake up path improvements Fernand Sieber
` (2 preceding siblings ...)
2025-09-22 12:39 ` [PATCH 3/4] sched/fair: Add cookie checks on wake idle path Fernand Sieber
@ 2025-09-22 12:39 ` Fernand Sieber
2025-09-23 8:55 ` K Prateek Nayak
3 siblings, 1 reply; 15+ messages in thread
From: Fernand Sieber @ 2025-09-22 12:39 UTC (permalink / raw)
To: mingo, peterz
Cc: linux-kernel, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, dwmw, jschoenh,
liuyuxua
The fast path in select_idle_sibling() can place tasks on CPUs without
considering core scheduling constraints, potentially causing immediate
force idle when the sibling runs an incompatible task.
Add cookie compatibility checks before selecting a CPU in the fast path.
This prevents placing waking tasks on CPUs where the sibling is running
an incompatible task, reducing force idle occurrences.
Signed-off-by: Fernand Sieber <sieberf@amazon.com>
---
kernel/sched/fair.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 78b36225a039..a9cbb0e9bb43 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7578,7 +7578,7 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
*/
if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
continue;
- if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
+ if (__select_idle_cpu(cpu, p) != -1)
return cpu;
}
@@ -7771,7 +7771,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
*/
lockdep_assert_irqs_disabled();
- if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
+ if ((__select_idle_cpu(target, p) != -1) &&
asym_fits_cpu(task_util, util_min, util_max, target))
return target;
@@ -7779,7 +7779,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
* If the previous CPU is cache affine and idle, don't be stupid:
*/
if (prev != target && cpus_share_cache(prev, target) &&
- (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
+ (__select_idle_cpu(prev, p) != -1) &&
asym_fits_cpu(task_util, util_min, util_max, prev)) {
if (!static_branch_unlikely(&sched_cluster_active) ||
@@ -7811,7 +7811,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
if (recent_used_cpu != prev &&
recent_used_cpu != target &&
cpus_share_cache(recent_used_cpu, target) &&
- (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
+ (__select_idle_cpu(recent_used_cpu, p) != -1) &&
cpumask_test_cpu(recent_used_cpu, p->cpus_ptr) &&
asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) {
--
2.43.0
* Re: [PATCH 2/4] sched/fair: Still look for the idlest cpu with no matching cookie
2025-09-22 12:39 ` [PATCH 2/4] sched/fair: Still look for the idlest cpu with no matching cookie Fernand Sieber
@ 2025-09-23 1:51 ` K Prateek Nayak
2025-09-23 7:32 ` Fernand Sieber
0 siblings, 1 reply; 15+ messages in thread
From: K Prateek Nayak @ 2025-09-23 1:51 UTC (permalink / raw)
To: Fernand Sieber, mingo, peterz
Cc: linux-kernel, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, dwmw, jschoenh,
liuyuxua
Hello Fernand,
On 9/22/2025 6:09 PM, Fernand Sieber wrote:
> @@ -7391,6 +7393,17 @@ sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *
> return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
Based on the above return in __sched_balance_find_dst_group_cpu(), it
should always return a valid CPU since "least_loaded_cpu" is initialized
to "this_cpu".
> }
>
> +/*
> + * sched_balance_find_dst_group_cpu - find the idlest CPU among the CPUs in the group.
> + */
> +static inline int
> +sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
> +{
> + int cpu = __sched_balance_find_dst_group_cpu(group, p, this_cpu, true);
> +
> + return cpu >= 0 ? cpu : __sched_balance_find_dst_group_cpu(group, p, this_cpu, false);
So, under what circumstance does "cpu" here turns out to be < 0?
Am I missing something?
> +}
> +
> static inline int sched_balance_find_dst_cpu(struct sched_domain *sd, struct task_struct *p,
> int cpu, int prev_cpu, int sd_flag)
> {
--
Thanks and Regards,
Prateek
* Re: [PATCH 2/4] sched/fair: Still look for the idlest cpu with no matching cookie
2025-09-23 1:51 ` K Prateek Nayak
@ 2025-09-23 7:32 ` Fernand Sieber
2025-09-23 7:44 ` Fernand Sieber
0 siblings, 1 reply; 15+ messages in thread
From: Fernand Sieber @ 2025-09-23 7:32 UTC (permalink / raw)
To: K Prateek Nayak
Cc: mingo, peterz, linux-kernel, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
dwmw, jschoenh, liuyuxua
Hi Prateek,
On 9/23/2025 7:21 AM, K Prateek Nayak wrote:
> Based on the above return in __sched_balance_find_dst_group_cpu(), it
> should always return a valid CPU since "least_loaded_cpu" is initialized
> to "this_cpu".
>
> So, under what circumstance does "cpu" here turns out to be < 0?
> Am I missing something?
Hey Prateek. Thanks for the catch. I'll fix as follows for next rev:
+/*
+ * sched_balance_find_dst_group_cpu - find the idlest CPU among the CPUs in the group.
+ */
+static inline int
+sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
+{
+ int cpu = __sched_balance_find_dst_group_cpu(group, p, -1, true);
+ return cpu >= 0 ? cpu : __sched_balance_find_dst_group_cpu(group, p, this_cpu, false);
+}
+
Thanks,
Fernand
* Re: [PATCH 2/4] sched/fair: Still look for the idlest cpu with no matching cookie
2025-09-23 7:32 ` Fernand Sieber
@ 2025-09-23 7:44 ` Fernand Sieber
0 siblings, 0 replies; 15+ messages in thread
From: Fernand Sieber @ 2025-09-23 7:44 UTC (permalink / raw)
To: K Prateek Nayak
Cc: mingo, peterz, linux-kernel, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
dwmw, jschoenh, liuyuxua
On 9/23/2025 9:32 AM, Fernand Sieber wrote:
> Hey Prateek. Thanks for the catch. I'll fix as follows for next rev:
>
> +/*
> + * sched_balance_find_dst_group_cpu - find the idlest CPU among the CPUs in the group.
> + */
> +static inline int
> +sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
> +{
> + int cpu = __sched_balance_find_dst_group_cpu(group, p, -1, true);
> + return cpu >= 0 ? cpu : __sched_balance_find_dst_group_cpu(group, p, this_cpu, false);
> +}
On second thoughts, this doesn't work as it breaks the original fallback path.
A better modification would be this:
@@ -7341,7 +7341,7 @@ __sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct
unsigned long load, min_load = ULONG_MAX;
unsigned int min_exit_latency = UINT_MAX;
u64 latest_idle_timestamp = 0;
- int least_loaded_cpu = this_cpu;
+ int least_loaded_cpu = -1;
int shallowest_idle_cpu = -1;
int i;
@@ -7357,6 +7357,9 @@ __sched_balance_find_dst_group_cpu(struct sched_group *group, struct task_struct
if (cookie_match ^ sched_core_cookie_match(rq, p))
continue;
+ if (least_loaded_cpu < 0)
+ least_loaded_cpu = this_cpu;
+
if (sched_idle_cpu(i))
return i;
Thanks,
Fernand
* Re: [PATCH 1/4] sched/fair: Fix cookie check on __select_idle_cpu()
2025-09-22 12:39 ` [PATCH 1/4] sched/fair: Fix cookie check on __select_idle_cpu() Fernand Sieber
@ 2025-09-23 8:42 ` K Prateek Nayak
2025-09-25 6:35 ` Madadi Vineeth Reddy
0 siblings, 1 reply; 15+ messages in thread
From: K Prateek Nayak @ 2025-09-23 8:42 UTC (permalink / raw)
To: Fernand Sieber, mingo, peterz
Cc: linux-kernel, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, dwmw, jschoenh,
liuyuxua
Hello Fernand,
On 9/22/2025 6:09 PM, Fernand Sieber wrote:
> @@ -7447,7 +7447,7 @@ static inline int sched_balance_find_dst_cpu(struct sched_domain *sd, struct tas
> static inline int __select_idle_cpu(int cpu, struct task_struct *p)
> {
> if ((available_idle_cpu(cpu) || sched_idle_cpu(cpu)) &&
> - sched_cpu_cookie_match(cpu_rq(cpu), p))
> + sched_core_cookie_match(cpu_rq(cpu), p))
__select_idle_cpu() is only called when "has_idle_core" is false which
means it is highly unlikely we'll find an idle core. In such cases, just
matching the cookie should be sufficient right?
Do you have any benchmark numbers which shows a large difference with
these changes?
> return cpu;
>
> return -1;
> @@ -7546,6 +7546,9 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
> {
> int cpu;
>
> + if (!sched_core_cookie_match(cpu_rq(target), p))
> + return -1;
> +
select_idle_smt() is again called when "has_idle_core" is false and
sched_cpu_cookie_match() should be sufficient for most part here too.
> for_each_cpu_and(cpu, cpu_smt_mask(target), p->cpus_ptr) {
> if (cpu == target)
> continue;
--
Thanks and Regards,
Prateek
* Re: [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path
2025-09-22 12:39 ` [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path Fernand Sieber
@ 2025-09-23 8:55 ` K Prateek Nayak
2025-09-23 9:30 ` Fernand Sieber
0 siblings, 1 reply; 15+ messages in thread
From: K Prateek Nayak @ 2025-09-23 8:55 UTC (permalink / raw)
To: Fernand Sieber, mingo, peterz
Cc: linux-kernel, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, dwmw, jschoenh,
liuyuxua
Hello Fernand,
On 9/22/2025 6:09 PM, Fernand Sieber wrote:
> The fast path in select_idle_sibling() can place tasks on CPUs without
> considering core scheduling constraints, potentially causing immediate
> force idle when the sibling runs an incompatible task.
>
> Add cookie compatibility checks before selecting a CPU in the fast path.
> This prevents placing waking tasks on CPUs where the sibling is running
> an incompatible task, reducing force idle occurrences.
>
> Signed-off-by: Fernand Sieber <sieberf@amazon.com>
> ---
> kernel/sched/fair.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 78b36225a039..a9cbb0e9bb43 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7578,7 +7578,7 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
> */
> if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> continue;
> - if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
> + if (__select_idle_cpu(cpu, p) != -1)
So with Patch 1, you already check for cookie matching while entering
select_idle_smt() and now, each pass of the loop again does a
sched_core_cookie_match() which internally loops through the smt mask
again! Seems wasteful.
> return cpu;
> }
>
> @@ -7771,7 +7771,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> */
> lockdep_assert_irqs_disabled();
>
> - if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
> + if ((__select_idle_cpu(target, p) != -1) &&
> asym_fits_cpu(task_util, util_min, util_max, target))
> return target;
>
> @@ -7779,7 +7779,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> * If the previous CPU is cache affine and idle, don't be stupid:
> */
> if (prev != target && cpus_share_cache(prev, target) &&
> - (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> + (__select_idle_cpu(prev, p) != -1) &&
> asym_fits_cpu(task_util, util_min, util_max, prev)) {
>
> if (!static_branch_unlikely(&sched_cluster_active) ||
> @@ -7811,7 +7811,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> if (recent_used_cpu != prev &&
> recent_used_cpu != target &&
> cpus_share_cache(recent_used_cpu, target) &&
> - (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
> + (__select_idle_cpu(recent_used_cpu, p) != -1) &&
On an SMT-8 system, all the looping over the smt mask per wakeup will add
up. Is that not a concern? A single task with a core cookie enabled will
add massive overhead for every wakeup in the system.
> cpumask_test_cpu(recent_used_cpu, p->cpus_ptr) &&
> asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) {
>
--
Thanks and Regards,
Prateek
* Re: [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path
2025-09-23 8:55 ` K Prateek Nayak
@ 2025-09-23 9:30 ` Fernand Sieber
2025-09-24 4:21 ` K Prateek Nayak
0 siblings, 1 reply; 15+ messages in thread
From: Fernand Sieber @ 2025-09-23 9:30 UTC (permalink / raw)
To: K Prateek Nayak
Cc: mingo, peterz, linux-kernel, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
dwmw, jschoenh, liuyuxua
Hi Prateek,
On 9/23/2025 2:25 PM, K Prateek Nayak wrote:
> So with Patch 1, you already check for cookie matching while entering
> select_idle_smt() and now, each pass of the loop again does a
> sched_core_cookie_match() which internally loops through the smt mask
> again! Seems wasteful.
Right. The change in select_idle_smt() is unnecessary.
> On an SMT-8 system, all the looping over smt mask per wakeup will add
> up. Is that not a concern? A single task with core cookie enabled will
> add massive overhead for all wakeup in the system.
In such a scenario there should generally be no looping because I introduced an
early return in patch 3 in __sched_core_cookie_match(). Perhaps it's worth
extracting this early return as a standalone optimization patch? Something like
this:
@@ -1404,10 +1404,12 @@ static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
if (!sched_core_enabled(rq))
return true;
+ if (rq->core->core_cookie == p->core_cookie)
+ return true;
+
for_each_cpu(cpu, cpu_smt_mask(cpu_of(rq))) {
if (!available_idle_cpu(cpu)) {
- idle_core = false;
- break;
+ return false;
}
}
@@ -1415,7 +1417,7 @@ static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
* A CPU in an idle core is always the best choice for tasks with
* cookies.
*/
- return idle_core || rq->core->core_cookie == p->core_cookie;
+ return true;
}
Thanks,
Fernand
* Re: [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path
2025-09-23 9:30 ` Fernand Sieber
@ 2025-09-24 4:21 ` K Prateek Nayak
2025-11-05 15:34 ` [PATCH 4/4] sched/fair: Add more core cookie check in wake up Fernand Sieber
2025-11-20 10:30 ` [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path Fernand Sieber
0 siblings, 2 replies; 15+ messages in thread
From: K Prateek Nayak @ 2025-09-24 4:21 UTC (permalink / raw)
To: Fernand Sieber
Cc: mingo, peterz, linux-kernel, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
dwmw, jschoenh, liuyuxua
Hello Fernand,
On 9/23/2025 3:00 PM, Fernand Sieber wrote:
> Hi Prateek,
>
> On 9/23/2025 2:25 PM, K Prateek Nayak wrote:
>> So with Patch 1, you already check for cookie matching while entering
>> select_idle_smt() and now, each pass of the loop again does a
>> sched_core_cookie_match() which internally loops through the smt mask
>> again! Seems wasteful.
>
> Right. The change in select_idle_smt() is unnecessary.
>
>> On an SMT-8 system, all the looping over smt mask per wakeup will add
>> up. Is that not a concern? A single task with core cookie enabled will
>> add massive overhead for all wakeup in the system.
>
> In such a scenario there should generally be no looping because I introduced an
> early return in patch 3 in __sched_core_cookie_match(). Perhaps it's worth
> extracting this early return as standalone optimization patch? Something like
> this:
Yes, that would be great! Thank you. And also please include some
benchmark numbers either in improved core utilization or the benchmark
results actually improving from these changes.
It would be great to know how much things improve by :)
--
Thanks and Regards,
Prateek
* Re: [PATCH 1/4] sched/fair: Fix cookie check on __select_idle_cpu()
2025-09-23 8:42 ` K Prateek Nayak
@ 2025-09-25 6:35 ` Madadi Vineeth Reddy
0 siblings, 0 replies; 15+ messages in thread
From: Madadi Vineeth Reddy @ 2025-09-25 6:35 UTC (permalink / raw)
To: K Prateek Nayak, Fernand Sieber
Cc: mingo, peterz, linux-kernel, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
dwmw, jschoenh, liuyuxua, Madadi Vineeth Reddy
Hi Prateek,
On 23/09/25 14:12, K Prateek Nayak wrote:
> Hello Fernand,
>
> On 9/22/2025 6:09 PM, Fernand Sieber wrote:
>> @@ -7447,7 +7447,7 @@ static inline int sched_balance_find_dst_cpu(struct sched_domain *sd, struct tas
>> static inline int __select_idle_cpu(int cpu, struct task_struct *p)
>> {
>> if ((available_idle_cpu(cpu) || sched_idle_cpu(cpu)) &&
>> - sched_cpu_cookie_match(cpu_rq(cpu), p))
>> + sched_core_cookie_match(cpu_rq(cpu), p))
>
> __select_idle_cpu() is only called when "has_idle_core" is false which
> means it is highly unlikely we'll find an idle core. In such cases, just
> matching the cookie should be sufficient right?
Agreed. The only code path I could find where __select_idle_cpu() is called
with has_idle_core == true is in the non-CONFIG_SCHED_SMT case, which is not
relevant to core scheduling.
Thanks,
Madadi Vineeth Reddy
>
> Do you have any benchmark numbers which shows a large difference with
> these changes?
>
>> return cpu;
>>
>> return -1;
>> @@ -7546,6 +7546,9 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
>> {
>> int cpu;
>>
>> + if (!sched_core_cookie_match(cpu_rq(target), p))
>> + return -1;
>> +
>
> select_idle_smt() is again called when "has_idle_core" is false and
> sched_cpu_cookie_match() should be sufficient for the most part here too.
>
>> for_each_cpu_and(cpu, cpu_smt_mask(target), p->cpus_ptr) {
>> if (cpu == target)
>> continue;
>
* Re: [PATCH 4/4] sched/fair: Add more core cookie check in wake up
2025-09-24 4:21 ` K Prateek Nayak
@ 2025-11-05 15:34 ` Fernand Sieber
2025-11-20 10:30 ` [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path Fernand Sieber
1 sibling, 0 replies; 15+ messages in thread
From: Fernand Sieber @ 2025-11-05 15:34 UTC (permalink / raw)
To: K Prateek Nayak
Cc: mingo, peterz, linux-kernel, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
dwmw, jschoenh, liuyuxua
Hi Prateek,
For now, I've extracted the core cookie match as a standalone commit:
https://lore.kernel.org/lkml/20251105152538.470586-1-sieberf@amazon.com/
--Fernand
* Re: [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path
2025-09-24 4:21 ` K Prateek Nayak
2025-11-05 15:34 ` [PATCH 4/4] sched/fair: Add more core cookie check in wake up Fernand Sieber
@ 2025-11-20 10:30 ` Fernand Sieber
1 sibling, 0 replies; 15+ messages in thread
From: Fernand Sieber @ 2025-11-20 10:30 UTC (permalink / raw)
To: K Prateek Nayak
Cc: mingo, peterz, linux-kernel, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, dwmw,
jschoenh, liuyuxua
Hi Prateek,
On 9/24/2025 9:51 AM, K Prateek Nayak wrote:
> Yes, that would be great! Thank you. Please also include some
> benchmark numbers, either improved core utilization or actual benchmark
> results that improve with these changes.
>
> It would be great to know how much things improve by :)
I've run some benchmarking and determined that most of the perf gains I
observed can be isolated to a subset of the changes in this series.
I've therefore repackaged them in a single patch, which also incorporates
other feedback gathered in this series:
https://lore.kernel.org/lkml/20251120101955.968586-1-sieberf@amazon.com/T/#u
I've also included benchmark results. This patch series can be cancelled
in favor of the new patch.
--
Thanks,
Fernand
end of thread, other threads:[~2025-11-20 10:31 UTC | newest]
Thread overview: 15+ messages
2025-09-22 12:39 [PATCH 0/4] sched/fair: Core sched wake up path improvements Fernand Sieber
2025-09-22 12:39 ` [PATCH 1/4] sched/fair: Fix cookie check on __select_idle_cpu() Fernand Sieber
2025-09-23 8:42 ` K Prateek Nayak
2025-09-25 6:35 ` Madadi Vineeth Reddy
2025-09-22 12:39 ` [PATCH 2/4] sched/fair: Still look for the idlest cpu with no matching cookie Fernand Sieber
2025-09-23 1:51 ` K Prateek Nayak
2025-09-23 7:32 ` Fernand Sieber
2025-09-23 7:44 ` Fernand Sieber
2025-09-22 12:39 ` [PATCH 3/4] sched/fair: Add cookie checks on wake idle path Fernand Sieber
2025-09-22 12:39 ` [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path Fernand Sieber
2025-09-23 8:55 ` K Prateek Nayak
2025-09-23 9:30 ` Fernand Sieber
2025-09-24 4:21 ` K Prateek Nayak
2025-11-05 15:34 ` [PATCH 4/4] sched/fair: Add more core cookie check in wake up Fernand Sieber
2025-11-20 10:30 ` [PATCH 4/4] sched/fair: Add more core cookie check in wake up fast path Fernand Sieber