* [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
@ 2025-08-12 9:33 Xuewen Yan
2025-08-12 21:29 ` Christian Loehle
2025-08-14 8:46 ` Dietmar Eggemann
0 siblings, 2 replies; 10+ messages in thread
From: Xuewen Yan @ 2025-08-12 9:33 UTC (permalink / raw)
To: dietmar.eggemann, mingo, peterz, juri.lelli, vincent.guittot
Cc: rostedt, bsegall, mgorman, vschneid, vdonnefort, ke.wang,
xuewen.yan94, linux-kernel
Currently we use for_each_cpu() to traverse all of the pd's cpus
in order to compute pd_cap. This approach may result in some
unnecessary per-CPU checks.
Since the loop adds the same cpu_actual_cap for every CPU in the
pd, we can simply calculate pd_cap as follows:

pd_cap = cpu_actual_cap * cpumask_weight(pd_cpus);

Then we can AND the pd's cpus, the sd's cpus and the task's
cpus_ptr before traversing, which saves those unnecessary checks.
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
kernel/sched/fair.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b173a059315c..e47fe94d6889 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8330,18 +8330,12 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 		cpu_actual_cap = get_actual_cpu_capacity(cpu);
 
 		eenv.cpu_cap = cpu_actual_cap;
-		eenv.pd_cap = 0;
+		eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);
 
-		for_each_cpu(cpu, cpus) {
-			struct rq *rq = cpu_rq(cpu);
-
-			eenv.pd_cap += cpu_actual_cap;
-
-			if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
-				continue;
+		cpumask_and(cpus, cpus, sched_domain_span(sd));
 
-			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
-				continue;
+		for_each_cpu_and(cpu, cpus, p->cpus_ptr) {
+			struct rq *rq = cpu_rq(cpu);
 
 			util = cpu_util(cpu, p, cpu, 0);
 			cpu_cap = capacity_of(cpu);
--
2.25.1
* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
2025-08-12 9:33 [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus Xuewen Yan
@ 2025-08-12 21:29 ` Christian Loehle
2025-08-14 8:46 ` Dietmar Eggemann
1 sibling, 0 replies; 10+ messages in thread
From: Christian Loehle @ 2025-08-12 21:29 UTC (permalink / raw)
To: Xuewen Yan, dietmar.eggemann, mingo, peterz, juri.lelli,
vincent.guittot
Cc: rostedt, bsegall, mgorman, vschneid, vdonnefort, ke.wang,
xuewen.yan94, linux-kernel
On 8/12/25 10:33, Xuewen Yan wrote:
> Currently we use for_each_cpu() to traverse all of the pd's cpus
> in order to compute pd_cap. This approach may result in some
> unnecessary per-CPU checks.
> Since the loop adds the same cpu_actual_cap for every CPU in the
> pd, we can simply calculate pd_cap as follows:
>
> pd_cap = cpu_actual_cap * cpumask_weight(pd_cpus);
>
> Then we can AND the pd's cpus, the sd's cpus and the task's
> cpus_ptr before traversing, which saves those unnecessary checks.
IMO this would be clearer if it read:
Calculate pd_cap as follows:
...
instead of traversing ...
Other than that LGTM.
>
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
> kernel/sched/fair.c | 14 ++++----------
> 1 file changed, 4 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b173a059315c..e47fe94d6889 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8330,18 +8330,12 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  		cpu_actual_cap = get_actual_cpu_capacity(cpu);
>  
>  		eenv.cpu_cap = cpu_actual_cap;
> -		eenv.pd_cap = 0;
> +		eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);
>  
> -		for_each_cpu(cpu, cpus) {
> -			struct rq *rq = cpu_rq(cpu);
> -
> -			eenv.pd_cap += cpu_actual_cap;
> -
> -			if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> -				continue;
> +		cpumask_and(cpus, cpus, sched_domain_span(sd));
>  
> -			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
> -				continue;
> +		for_each_cpu_and(cpu, cpus, p->cpus_ptr) {
> +			struct rq *rq = cpu_rq(cpu);
>  
> 			util = cpu_util(cpu, p, cpu, 0);
> 			cpu_cap = capacity_of(cpu);
* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
2025-08-12 9:33 [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus Xuewen Yan
2025-08-12 21:29 ` Christian Loehle
@ 2025-08-14 8:46 ` Dietmar Eggemann
2025-08-14 9:52 ` Xuewen Yan
1 sibling, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2025-08-14 8:46 UTC (permalink / raw)
To: Xuewen Yan, mingo, peterz, juri.lelli, vincent.guittot
Cc: rostedt, bsegall, mgorman, vschneid, vdonnefort, ke.wang,
xuewen.yan94, linux-kernel
On 12.08.25 10:33, Xuewen Yan wrote:
> Currently we use for_each_cpu() to traverse all of the pd's cpus
> in order to compute pd_cap. This approach may result in some
> unnecessary per-CPU checks.
> Since the loop adds the same cpu_actual_cap for every CPU in the
> pd, we can simply calculate pd_cap as follows:
>
> pd_cap = cpu_actual_cap * cpumask_weight(pd_cpus);
>
> Then we can AND the pd's cpus, the sd's cpus and the task's
> cpus_ptr before traversing, which saves those unnecessary checks.
>
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
> kernel/sched/fair.c | 14 ++++----------
> 1 file changed, 4 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b173a059315c..e47fe94d6889 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8330,18 +8330,12 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
Just a thought ...
for (; pd; pd = pd->next) {

	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);     <-- (1)
	cpumask_and(cpus, cpus, perf_domain_span(pd));
	cpumask_and(cpus, cpus, cpu_online_mask);

	if (cpumask_empty(cpus))
		continue;                                          <-- (2)
Can you not mask cpus already early in the pd loop (1) and then profit
from (2) in these rare cases? IIRC, the sd only plays a role here in
exclusive cpusets scenarios, which I don't think anybody deploys with EAS?
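E.g. (rough sketch, all spans hypothetical): a task with cpus_ptr=0-3
could then skip a pd spanning CPUs 6-7 without entering the CPU loop
at all:

	/* hypothetical spans: sd = 0-7, p->cpus_ptr = 0-3, pd = 6-7 */
	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr); /* 0-3 */
	cpumask_and(cpus, cpus, perf_domain_span(pd));         /* empty */
	cpumask_and(cpus, cpus, cpu_online_mask);

	if (cpumask_empty(cpus))
		continue;               /* (2): whole pd skipped early */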
>  		cpu_actual_cap = get_actual_cpu_capacity(cpu);
>  
>  		eenv.cpu_cap = cpu_actual_cap;
> -		eenv.pd_cap = 0;
> +		eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);
>  
> -		for_each_cpu(cpu, cpus) {
> -			struct rq *rq = cpu_rq(cpu);
> -
> -			eenv.pd_cap += cpu_actual_cap;
> -
> -			if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> -				continue;
> +		cpumask_and(cpus, cpus, sched_domain_span(sd));
>  
> -			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
> -				continue;
> +		for_each_cpu_and(cpu, cpus, p->cpus_ptr) {
> +			struct rq *rq = cpu_rq(cpu);
>  
> 			util = cpu_util(cpu, p, cpu, 0);
> 			cpu_cap = capacity_of(cpu);
* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
2025-08-14 8:46 ` Dietmar Eggemann
@ 2025-08-14 9:52 ` Xuewen Yan
2025-08-15 13:01 ` Dietmar Eggemann
0 siblings, 1 reply; 10+ messages in thread
From: Xuewen Yan @ 2025-08-14 9:52 UTC (permalink / raw)
To: Dietmar Eggemann, Christian Loehle
Cc: Xuewen Yan, mingo, peterz, juri.lelli, vincent.guittot, rostedt,
bsegall, mgorman, vschneid, vdonnefort, ke.wang, linux-kernel
Hi Dietmar,
On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
>
>
> On 12.08.25 10:33, Xuewen Yan wrote:
> > Currently we use for_each_cpu() to traverse all of the pd's cpus
> > in order to compute pd_cap. This approach may result in some
> > unnecessary per-CPU checks.
> > Since the loop adds the same cpu_actual_cap for every CPU in the
> > pd, we can simply calculate pd_cap as follows:
> >
> > pd_cap = cpu_actual_cap * cpumask_weight(pd_cpus);
> >
> > Then we can AND the pd's cpus, the sd's cpus and the task's
> > cpus_ptr before traversing, which saves those unnecessary checks.
> >
> > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> > ---
> > kernel/sched/fair.c | 14 ++++----------
> > 1 file changed, 4 insertions(+), 10 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index b173a059315c..e47fe94d6889 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8330,18 +8330,12 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>
> Just a thought ...
>
> for (; pd; pd = pd->next) {
>
> 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);     <-- (1)
> 	cpumask_and(cpus, cpus, perf_domain_span(pd));
> 	cpumask_and(cpus, cpus, cpu_online_mask);
>
> 	if (cpumask_empty(cpus))
> 		continue;                                          <-- (2)
>
> Can you not mask cpus already early in the pd loop (1) and then profit
> from (2) in these rare cases?
I do not think the AND with cpus_ptr could be placed before the
pd_cap calculation, because the following scenario should be
considered:
the task's cpus_ptr: 0,1,2,3
the pd's cpus: 0,1,2,3,4,5,6
so the pd's cap = cpu_cap * 7;
but if we cpumask_and(pd's cpus, p->cpus_ptr) first,
the cpumask_weight = 4, and
the pd's cap = cpu_cap * 4, which is wrong.
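IOW, a rough sketch (using the hypothetical CPU numbers above) of the
ordering that keeps pd_cap correct, as in the patch:

	/* pd_cap must come from the pd's full online span: 7 CPUs */
	eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);

	/* only afterwards may the candidate mask be narrowed */
	cpumask_and(cpus, cpus, sched_domain_span(sd));

	for_each_cpu_and(cpu, cpus, p->cpus_ptr) {
		/* at most the 4 CPUs from p->cpus_ptr are visited */
		...
	}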
> IIRC, the sd only plays a role here in
> exclusive cpusets scenarios, which I don't think anybody deploys with EAS?
I am also wondering if the check for SD's CPUs could be removed...
Thanks!
>
> >  		cpu_actual_cap = get_actual_cpu_capacity(cpu);
> >  
> >  		eenv.cpu_cap = cpu_actual_cap;
> > -		eenv.pd_cap = 0;
> > +		eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);
> >  
> > -		for_each_cpu(cpu, cpus) {
> > -			struct rq *rq = cpu_rq(cpu);
> > -
> > -			eenv.pd_cap += cpu_actual_cap;
> > -
> > -			if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> > -				continue;
> > +		cpumask_and(cpus, cpus, sched_domain_span(sd));
> >  
> > -			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
> > -				continue;
> > +		for_each_cpu_and(cpu, cpus, p->cpus_ptr) {
> > +			struct rq *rq = cpu_rq(cpu);
> >  
> > 			util = cpu_util(cpu, p, cpu, 0);
> > 			cpu_cap = capacity_of(cpu);
>
* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
2025-08-14 9:52 ` Xuewen Yan
@ 2025-08-15 13:01 ` Dietmar Eggemann
2025-08-18 11:05 ` Xuewen Yan
0 siblings, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2025-08-15 13:01 UTC (permalink / raw)
To: Xuewen Yan, Christian Loehle
Cc: Xuewen Yan, mingo, peterz, juri.lelli, vincent.guittot, rostedt,
bsegall, mgorman, vschneid, vdonnefort, ke.wang, linux-kernel
On 14.08.25 10:52, Xuewen Yan wrote:
> Hi Dietmar,
>
> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 12.08.25 10:33, Xuewen Yan wrote:
[...]
>> Can you not mask cpus already early in the pd loop (1) and then profit
>> from (2) in these rare cases?
>
> I do not think the AND with cpus_ptr could be placed before the
> pd_cap calculation, because the following scenario should be
> considered:
> the task's cpus_ptr: 0,1,2,3
> the pd's cpus: 0,1,2,3,4,5,6
> so the pd's cap = cpu_cap * 7;
> but if we cpumask_and(pd's cpus, p->cpus_ptr) first,
> the cpumask_weight = 4, and
> the pd's cap = cpu_cap * 4, which is wrong.
Yes, you're right! Missed this one.
>> IIRC, the sd only plays a role here in
>> exclusive cpusets scenarios, which I don't think anybody deploys with EAS?
>
> I am also wondering if the check for SD's CPUs could be removed...
Still not 100% sure here. I would have to play with cpusets and EAS a
little bit more. Are you thinking that in those cases p->cpus_ptr
already covers the cpuset restriction so that the sd mask isn't necessary?
* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
2025-08-15 13:01 ` Dietmar Eggemann
@ 2025-08-18 11:05 ` Xuewen Yan
2025-08-18 15:24 ` Dietmar Eggemann
0 siblings, 1 reply; 10+ messages in thread
From: Xuewen Yan @ 2025-08-18 11:05 UTC (permalink / raw)
To: Dietmar Eggemann
Cc: Christian Loehle, Xuewen Yan, mingo, peterz, juri.lelli,
vincent.guittot, rostedt, bsegall, mgorman, vschneid, vdonnefort,
ke.wang, linux-kernel
On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 14.08.25 10:52, Xuewen Yan wrote:
> > Hi Dietmar,
> >
> > On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
> > <dietmar.eggemann@arm.com> wrote:
> >>
> >> On 12.08.25 10:33, Xuewen Yan wrote:
>
> [...]
>
> >> Can you not mask cpus already early in the pd loop (1) and then profit
> >> from (2) in these rare cases?
> >
> > I do not think the AND with cpus_ptr could be placed before the
> > pd_cap calculation, because the following scenario should be
> > considered:
> > the task's cpus_ptr: 0,1,2,3
> > the pd's cpus: 0,1,2,3,4,5,6
> > so the pd's cap = cpu_cap * 7;
> > but if we cpumask_and(pd's cpus, p->cpus_ptr) first,
> > the cpumask_weight = 4, and
> > the pd's cap = cpu_cap * 4, which is wrong.
>
> Yes, you're right! Missed this one.
>
> >> IIRC, the sd only plays a role here in
> >> exclusive cpusets scenarios, which I don't think anybody deploys with EAS?
> >
> > I am also wondering if the check for SD's CPUs could be removed...
>
> Still not 100% sure here. I would have to play with cpusets and EAS a
> little bit more. Are you thinking that in those cases p->cpus_ptr
> already covers the cpuset restriction so that the sd mask isn't necessary?
I am not familiar with cpuset, so I can't guarantee this. Similarly, I
also need to learn more about cpuset and cpu topology before I can
answer this question.
Thanks!
* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
2025-08-18 11:05 ` Xuewen Yan
@ 2025-08-18 15:24 ` Dietmar Eggemann
2025-08-19 2:02 ` Xuewen Yan
0 siblings, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2025-08-18 15:24 UTC (permalink / raw)
To: Xuewen Yan
Cc: Christian Loehle, Xuewen Yan, mingo, peterz, juri.lelli,
vincent.guittot, rostedt, bsegall, mgorman, vschneid, vdonnefort,
ke.wang, linux-kernel
On 18.08.25 12:05, Xuewen Yan wrote:
> On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 14.08.25 10:52, Xuewen Yan wrote:
>>> Hi Dietmar,
>>>
>>> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
>>> <dietmar.eggemann@arm.com> wrote:
>>>>
>>>> On 12.08.25 10:33, Xuewen Yan wrote:
>>
>> [...]
>>
>>>> Can you not mask cpus already early in the pd loop (1) and then profit
>>>> from (2) in these rare cases?
>>>
>>> I do not think the AND with cpus_ptr could be placed before the
>>> pd_cap calculation, because the following scenario should be
>>> considered:
>>> the task's cpus_ptr: 0,1,2,3
>>> the pd's cpus: 0,1,2,3,4,5,6
>>> so the pd's cap = cpu_cap * 7;
>>> but if we cpumask_and(pd's cpus, p->cpus_ptr) first,
>>> the cpumask_weight = 4, and
>>> the pd's cap = cpu_cap * 4, which is wrong.
>>
>> Yes, you're right! Missed this one.
>>
>>>> IIRC, the sd only plays a role here in
>>>> exclusive cpusets scenarios, which I don't think anybody deploys with EAS?
>>>
>>> I am also wondering if the check for SD's CPUs could be removed...
>>
>> Still not 100% sure here. I would have to play with cpusets and EAS a
>> little bit more. Are you thinking that in those cases p->cpus_ptr
>> already covers the cpuset restriction so that the sd mask isn't necessary?
>
> I am not familiar with cpuset, so I can't guarantee this. Similarly, I
> also need to learn more about cpuset and cpu topology before I can
> answer this question.
Looks like we do also need the sd cpumask here.
Consider this tri-gear system:
# cat /sys/devices/system/cpu/cpu*/cpu_capacity
160
160
160
160
498
498
1024
1024
and 2 exclusive cpusets cs1={0-1,4,6} and cs2={2-3,5,7}, so EAS is
possible in all 3 root_domains (/, /cs1, /cs2):
...
[ 74.691104] CPU1 attaching sched-domain(s):
[ 74.691180] domain-0: span=0-1 level=MC
[ 74.691244] groups: 1:{ span=1 cap=159 }, 0:{ span=0 cap=155 }
[ 74.693453] domain-1: span=0-1,4,6 level=PKG
[ 74.693534] groups: 0:{ span=0-1 cap=314 }, 4:{ span=4 cap=496 },
6:{ span=6 cap=986 }
...
[ 74.697890] root domain span: 0-1,4,6
[ 74.697994] root_domain 2-3,5,7: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
[ 74.698922] root_domain 0-1,4,6: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
sd = rcu_dereference(*this_cpu_ptr(&sd_asym_cpucapacity));
Tasks running in '/' only have the sd span to restrict the CPU affinity correctly.
...
[001] 5290.935663: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
[001] 5290.935696: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
pd=6-7 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
[001] 5290.935753: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
pd=4-5 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
[001] 5290.935779: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
pd=0-3 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
...
* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
2025-08-18 15:24 ` Dietmar Eggemann
@ 2025-08-19 2:02 ` Xuewen Yan
2025-08-19 14:01 ` Dietmar Eggemann
0 siblings, 1 reply; 10+ messages in thread
From: Xuewen Yan @ 2025-08-19 2:02 UTC (permalink / raw)
To: Dietmar Eggemann
Cc: Christian Loehle, Xuewen Yan, mingo, peterz, juri.lelli,
vincent.guittot, rostedt, bsegall, mgorman, vschneid, vdonnefort,
ke.wang, linux-kernel
On Mon, Aug 18, 2025 at 11:24 PM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 18.08.25 12:05, Xuewen Yan wrote:
> > On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
> > <dietmar.eggemann@arm.com> wrote:
> >>
> >> On 14.08.25 10:52, Xuewen Yan wrote:
> >>> Hi Dietmar,
> >>>
> >>> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
> >>> <dietmar.eggemann@arm.com> wrote:
> >>>>
> >>>> On 12.08.25 10:33, Xuewen Yan wrote:
> >>
> >> [...]
> >>
> >>>> Can you not mask cpus already early in the pd loop (1) and then profit
> >>>> from (2) in these rare cases?
> >>>
> >>> I do not think the AND with cpus_ptr could be placed before the
> >>> pd_cap calculation, because the following scenario should be
> >>> considered:
> >>> the task's cpus_ptr: 0,1,2,3
> >>> the pd's cpus: 0,1,2,3,4,5,6
> >>> so the pd's cap = cpu_cap * 7;
> >>> but if we cpumask_and(pd's cpus, p->cpus_ptr) first,
> >>> the cpumask_weight = 4, and
> >>> the pd's cap = cpu_cap * 4, which is wrong.
> >>
> >> Yes, you're right! Missed this one.
> >>
> >>>> IIRC, the sd only plays a role here in
> >>>> exclusive cpusets scenarios, which I don't think anybody deploys with EAS?
> >>>
> >>> I am also wondering if the check for SD's CPUs could be removed...
> >>
> >> Still not 100% sure here. I would have to play with cpusets and EAS a
> >> little bit more. Are you thinking that in those cases p->cpus_ptr
> >> already covers the cpuset restriction so that the sd mask isn't necessary?
> >
> > I am not familiar with cpuset, so I can't guarantee this. Similarly, I
> > also need to learn more about cpuset and cpu topology before I can
> > answer this question.
>
> Looks like we do also need the sd cpumask here.
>
> Consider this tri-gear system:
>
> # cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 160
> 160
> 160
> 160
> 498
> 498
> 1024
> 1024
>
> and 2 exclusive cpusets cs1={0-1,4,6} and cs2={2-3,5,7}, so EAS is
> possible in all 3 root_domains (/, /cs1, /cs2):
Isn't your CPU an ARM DynamIQ architecture?
In my understanding, for a DynamIQ arch there is only one MC domain...
Did I miss something?
Thanks!
>
> ...
> [ 74.691104] CPU1 attaching sched-domain(s):
> [ 74.691180] domain-0: span=0-1 level=MC
> [ 74.691244] groups: 1:{ span=1 cap=159 }, 0:{ span=0 cap=155 }
> [ 74.693453] domain-1: span=0-1,4,6 level=PKG
> [ 74.693534] groups: 0:{ span=0-1 cap=314 }, 4:{ span=4 cap=496 },
> 6:{ span=6 cap=986 }
> ...
> [ 74.697890] root domain span: 0-1,4,6
> [ 74.697994] root_domain 2-3,5,7: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
> cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
> [ 74.698922] root_domain 0-1,4,6: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
> cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
>
>
> sd = rcu_dereference(*this_cpu_ptr(&sd_asym_cpucapacity));
>
>
> Tasks running in '/' only have the sd span to restrict the CPU affinity correctly.
>
> ...
> [001] 5290.935663: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
> [001] 5290.935696: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
> pd=6-7 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
> [001] 5290.935753: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
> pd=4-5 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
> [001] 5290.935779: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
> pd=0-3 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
> ...
* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
2025-08-19 2:02 ` Xuewen Yan
@ 2025-08-19 14:01 ` Dietmar Eggemann
2025-08-20 11:09 ` Xuewen Yan
0 siblings, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2025-08-19 14:01 UTC (permalink / raw)
To: Xuewen Yan
Cc: Christian Loehle, Xuewen Yan, mingo, peterz, juri.lelli,
vincent.guittot, rostedt, bsegall, mgorman, vschneid, vdonnefort,
ke.wang, linux-kernel
On 19.08.25 03:02, Xuewen Yan wrote:
> On Mon, Aug 18, 2025 at 11:24 PM Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 18.08.25 12:05, Xuewen Yan wrote:
>>> On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
>>> <dietmar.eggemann@arm.com> wrote:
>>>>
>>>> On 14.08.25 10:52, Xuewen Yan wrote:
>>>>> Hi Dietmar,
>>>>>
>>>>> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
>>>>> <dietmar.eggemann@arm.com> wrote:
>>>>>>
>>>>>> On 12.08.25 10:33, Xuewen Yan wrote:
[...]
>> Looks like we do also need the sd cpumask here.
>>
>> Consider this tri-gear system:
>>
>> # cat /sys/devices/system/cpu/cpu*/cpu_capacity
>> 160
>> 160
>> 160
>> 160
>> 498
>> 498
>> 1024
>> 1024
>>
>> and 2 exclusive cpusets cs1={0-1,4,6} and cs2={2-3,5,7}, so EAS is
>> possible in all 3 root_domains (/, /cs1, /cs2):
>
> Isn't your CPU an ARM DynamIQ architecture?
> In my understanding, for a DynamIQ arch there is only one MC domain...
> Did I miss something?
Ah, should have mentioned that this is qemu. I used a dts file
(qemu-system-aarch64 ... -dtb foo.dtb) with individual
'next-level-cache' entries for the CPUs {0-3}, {4-5} and {6-7} so that's
why you see MC & PKG. Removing those gives you a system with only MC:
[ 106.986828] CPU2 attaching sched-domain(s):
[ 106.987846] domain-0: span=2-3,5,7 level=MC
[ 106.987941] groups: 2:{ span=2 cap=159 }, 3:{ span=3 cap=154 }, 5:{
span=5 cap=495 }, 7:{ span=7 cap=991 }
[ 106.988842] CPU3 attaching sched-domain(s):
[ 106.989096] domain-0: span=2-3,5,7 level=MC
[ 106.989136] groups: 3:{ span=3 cap=154 }, 5:{ span=5 cap=495 }, 7:{
span=7 cap=991 }, 2:{ span=2 cap=159 }
[ 106.989679] CPU5 attaching sched-domain(s):
[ 106.989692] domain-0: span=2-3,5,7 level=MC
[ 106.989773] groups: 5:{ span=5 cap=495 }, 7:{ span=7 cap=991 }, 2:{
span=2 cap=159 }, 3:{ span=3 cap=154 }
[ 106.990466] CPU7 attaching sched-domain(s):
[ 106.990482] domain-0: span=2-3,5,7 level=MC
[ 106.990632] groups: 7:{ span=7 cap=991 }, 2:{ span=2 cap=159 }, 3:{
span=3 cap=154 }, 5:{ span=5 cap=495 }
[ 106.997604] root domain span: 2-3,5,7
[ 106.998267] CPU0 attaching sched-domain(s):
[ 106.998278] domain-0: span=0-1,4,6 level=MC
[ 106.998295] groups: 0:{ span=0 cap=159 }, 1:{ span=1 cap=160 }, 4:{
span=4 cap=496 }, 6:{ span=6 cap=995 }
[ 106.998584] CPU1 attaching sched-domain(s):
[ 106.998592] domain-0: span=0-1,4,6 level=MC
[ 106.998604] groups: 1:{ span=1 cap=160 }, 4:{ span=4 cap=496 }, 6:{
span=6 cap=995 }, 0:{ span=0 cap=159 }
[ 106.999477] CPU4 attaching sched-domain(s):
[ 106.999487] domain-0: span=0-1,4,6 level=MC
[ 106.999504] groups: 4:{ span=4 cap=496 }, 6:{ span=6 cap=995 }, 0:{
span=0 cap=159 }, 1:{ span=1 cap=160 }
[ 107.000070] CPU6 attaching sched-domain(s):
[ 107.000082] domain-0: span=0-1,4,6 level=MC
[ 107.000095] groups: 6:{ span=6 cap=995 }, 0:{ span=0 cap=159 }, 1:{
span=1 cap=160 }, 4:{ span=4 cap=496 }
[ 107.000721] root domain span: 0-1,4,6
[ 107.001152] root_domain 2-3,5,7: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
[ 107.001869] root_domain 0-1,4,6: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
[...]
* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
2025-08-19 14:01 ` Dietmar Eggemann
@ 2025-08-20 11:09 ` Xuewen Yan
0 siblings, 0 replies; 10+ messages in thread
From: Xuewen Yan @ 2025-08-20 11:09 UTC (permalink / raw)
To: Dietmar Eggemann
Cc: Christian Loehle, Xuewen Yan, mingo, peterz, juri.lelli,
vincent.guittot, rostedt, bsegall, mgorman, vschneid, vdonnefort,
ke.wang, linux-kernel
On Tue, Aug 19, 2025 at 10:01 PM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 19.08.25 03:02, Xuewen Yan wrote:
> > On Mon, Aug 18, 2025 at 11:24 PM Dietmar Eggemann
> > <dietmar.eggemann@arm.com> wrote:
> >>
> >> On 18.08.25 12:05, Xuewen Yan wrote:
> >>> On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
> >>> <dietmar.eggemann@arm.com> wrote:
> >>>>
> >>>> On 14.08.25 10:52, Xuewen Yan wrote:
> >>>>> Hi Dietmar,
> >>>>>
> >>>>> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
> >>>>> <dietmar.eggemann@arm.com> wrote:
> >>>>>>
> >>>>>> On 12.08.25 10:33, Xuewen Yan wrote:
>
> [...]
>
> >> Looks like we do also need the sd cpumask here.
> >>
> >> Consider this tri-gear system:
> >>
> >> # cat /sys/devices/system/cpu/cpu*/cpu_capacity
> >> 160
> >> 160
> >> 160
> >> 160
> >> 498
> >> 498
> >> 1024
> >> 1024
> >>
> >> and 2 exclusive cpusets cs1={0-1,4,6} and cs2={2-3,5,7}, so EAS is
> >> possible in all 3 root_domains (/, /cs1, /cs2):
> >
> > Isn't your CPU an ARM DynamIQ architecture?
> > In my understanding, for a DynamIQ arch there is only one MC domain...
> > Did I miss something?
>
> Ah, should have mentioned that this is qemu. I used a dts file
> (qemu-system-aarch64 ... -dtb foo.dtb) with individual
> 'next-level-cache' entries for the CPUs {0-3}, {4-5} and {6-7} so that's
> why you see MC & PKG. Removing those gives you a system with only MC:
Thank you for your patience in explaining :)
Looks like we do also need the sd cpumask here.
Thanks!
>
> [ 106.986828] CPU2 attaching sched-domain(s):
> [ 106.987846] domain-0: span=2-3,5,7 level=MC
> [ 106.987941] groups: 2:{ span=2 cap=159 }, 3:{ span=3 cap=154 }, 5:{
> span=5 cap=495 }, 7:{ span=7 cap=991 }
> [ 106.988842] CPU3 attaching sched-domain(s):
> [ 106.989096] domain-0: span=2-3,5,7 level=MC
> [ 106.989136] groups: 3:{ span=3 cap=154 }, 5:{ span=5 cap=495 }, 7:{
> span=7 cap=991 }, 2:{ span=2 cap=159 }
> [ 106.989679] CPU5 attaching sched-domain(s):
> [ 106.989692] domain-0: span=2-3,5,7 level=MC
> [ 106.989773] groups: 5:{ span=5 cap=495 }, 7:{ span=7 cap=991 }, 2:{
> span=2 cap=159 }, 3:{ span=3 cap=154 }
> [ 106.990466] CPU7 attaching sched-domain(s):
> [ 106.990482] domain-0: span=2-3,5,7 level=MC
> [ 106.990632] groups: 7:{ span=7 cap=991 }, 2:{ span=2 cap=159 }, 3:{
> span=3 cap=154 }, 5:{ span=5 cap=495 }
> [ 106.997604] root domain span: 2-3,5,7
> [ 106.998267] CPU0 attaching sched-domain(s):
> [ 106.998278] domain-0: span=0-1,4,6 level=MC
> [ 106.998295] groups: 0:{ span=0 cap=159 }, 1:{ span=1 cap=160 }, 4:{
> span=4 cap=496 }, 6:{ span=6 cap=995 }
> [ 106.998584] CPU1 attaching sched-domain(s):
> [ 106.998592] domain-0: span=0-1,4,6 level=MC
> [ 106.998604] groups: 1:{ span=1 cap=160 }, 4:{ span=4 cap=496 }, 6:{
> span=6 cap=995 }, 0:{ span=0 cap=159 }
> [ 106.999477] CPU4 attaching sched-domain(s):
> [ 106.999487] domain-0: span=0-1,4,6 level=MC
> [ 106.999504] groups: 4:{ span=4 cap=496 }, 6:{ span=6 cap=995 }, 0:{
> span=0 cap=159 }, 1:{ span=1 cap=160 }
> [ 107.000070] CPU6 attaching sched-domain(s):
> [ 107.000082] domain-0: span=0-1,4,6 level=MC
> [ 107.000095] groups: 6:{ span=6 cap=995 }, 0:{ span=0 cap=159 }, 1:{
> span=1 cap=160 }, 4:{ span=4 cap=496 }
> [ 107.000721] root domain span: 0-1,4,6
> [ 107.001152] root_domain 2-3,5,7: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
> cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
> [ 107.001869] root_domain 0-1,4,6: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
> cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
>
> [...]
end of thread, other threads: [~2025-08-20 11:09 UTC | newest]
Thread overview: 10+ messages
2025-08-12 9:33 [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus Xuewen Yan
2025-08-12 21:29 ` Christian Loehle
2025-08-14 8:46 ` Dietmar Eggemann
2025-08-14 9:52 ` Xuewen Yan
2025-08-15 13:01 ` Dietmar Eggemann
2025-08-18 11:05 ` Xuewen Yan
2025-08-18 15:24 ` Dietmar Eggemann
2025-08-19 2:02 ` Xuewen Yan
2025-08-19 14:01 ` Dietmar Eggemann
2025-08-20 11:09 ` Xuewen Yan