linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus
@ 2025-08-12  9:33 Xuewen Yan
  2025-08-12 21:29 ` Christian Loehle
  2025-08-14  8:46 ` Dietmar Eggemann
  0 siblings, 2 replies; 10+ messages in thread
From: Xuewen Yan @ 2025-08-12  9:33 UTC
  To: dietmar.eggemann, mingo, peterz, juri.lelli, vincent.guittot
  Cc: rostedt, bsegall, mgorman, vschneid, vdonnefort, ke.wang,
	xuewen.yan94, linux-kernel

Currently we use for_each_cpu() to traverse all of the pd's
CPUs in order to compute pd_cap. This approach results in
some unnecessary per-CPU checks.
Instead, pd_cap can simply be calculated as:

pd_cap = cpu_actual_cap * cpumask_weight(pd_cpus);

Then the pd's cpus, the sd's cpus and the task's cpus_ptr can
be ANDed before traversing, which avoids those checks.
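
For illustration, with hypothetical numbers: for a pd spanning four
online CPUs whose actual capacity is 760 each, a single multiplication
replaces four per-CPU additions:

	cpu_actual_cap = 760;	/* hypothetical capacity value */
	eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);	/* 760 * 4 = 3040 */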

Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
 kernel/sched/fair.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b173a059315c..e47fe94d6889 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8330,18 +8330,12 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 		cpu_actual_cap = get_actual_cpu_capacity(cpu);
 
 		eenv.cpu_cap = cpu_actual_cap;
-		eenv.pd_cap = 0;
+		eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);
 
-		for_each_cpu(cpu, cpus) {
-			struct rq *rq = cpu_rq(cpu);
-
-			eenv.pd_cap += cpu_actual_cap;
-
-			if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
-				continue;
+		cpumask_and(cpus, cpus, sched_domain_span(sd));
 
-			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
-				continue;
+		for_each_cpu_and(cpu, cpus, p->cpus_ptr) {
+			struct rq *rq = cpu_rq(cpu);
 
 			util = cpu_util(cpu, p, cpu, 0);
 			cpu_cap = capacity_of(cpu);
-- 
2.25.1



* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus
  2025-08-12  9:33 [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus Xuewen Yan
@ 2025-08-12 21:29 ` Christian Loehle
  2025-08-14  8:46 ` Dietmar Eggemann
  1 sibling, 0 replies; 10+ messages in thread
From: Christian Loehle @ 2025-08-12 21:29 UTC
  To: Xuewen Yan, dietmar.eggemann, mingo, peterz, juri.lelli,
	vincent.guittot
  Cc: rostedt, bsegall, mgorman, vschneid, vdonnefort, ke.wang,
	xuewen.yan94, linux-kernel

On 8/12/25 10:33, Xuewen Yan wrote:
> Currently we use for_each_cpu() to traverse all of the pd's
> CPUs in order to compute pd_cap. This approach results in
> some unnecessary per-CPU checks.
> Instead, pd_cap can simply be calculated as:
> 
> pd_cap = cpu_actual_cap * cpumask_weight(pd_cpus);
> 
> Then the pd's cpus, the sd's cpus and the task's cpus_ptr can
> be ANDed before traversing, which avoids those checks.

IMO this would be clearer if it's:
Calculate pd_cap as follows:
...

instead of traversing ...

Other than that LGTM.

> 
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
>  kernel/sched/fair.c | 14 ++++----------
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b173a059315c..e47fe94d6889 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8330,18 +8330,12 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  		cpu_actual_cap = get_actual_cpu_capacity(cpu);
>  
>  		eenv.cpu_cap = cpu_actual_cap;
> -		eenv.pd_cap = 0;
> +		eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);
>  
> -		for_each_cpu(cpu, cpus) {
> -			struct rq *rq = cpu_rq(cpu);
> -
> -			eenv.pd_cap += cpu_actual_cap;
> -
> -			if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> -				continue;
> +		cpumask_and(cpus, cpus, sched_domain_span(sd));
>  
> -			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
> -				continue;
> +		for_each_cpu_and(cpu, cpus, p->cpus_ptr) {
> +			struct rq *rq = cpu_rq(cpu);
>  
>  			util = cpu_util(cpu, p, cpu, 0);
>  			cpu_cap = capacity_of(cpu);



* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus
  2025-08-12  9:33 [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus Xuewen Yan
  2025-08-12 21:29 ` Christian Loehle
@ 2025-08-14  8:46 ` Dietmar Eggemann
  2025-08-14  9:52   ` Xuewen Yan
  1 sibling, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2025-08-14  8:46 UTC
  To: Xuewen Yan, mingo, peterz, juri.lelli, vincent.guittot
  Cc: rostedt, bsegall, mgorman, vschneid, vdonnefort, ke.wang,
	xuewen.yan94, linux-kernel



On 12.08.25 10:33, Xuewen Yan wrote:
> Currently we use for_each_cpu() to traverse all of the pd's
> CPUs in order to compute pd_cap. This approach results in
> some unnecessary per-CPU checks.
> Instead, pd_cap can simply be calculated as:
> 
> pd_cap = cpu_actual_cap * cpumask_weight(pd_cpus);
> 
> Then the pd's cpus, the sd's cpus and the task's cpus_ptr can
> be ANDed before traversing, which avoids those checks.
> 
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
>  kernel/sched/fair.c | 14 ++++----------
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b173a059315c..e47fe94d6889 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8330,18 +8330,12 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)

Just a thought ...

for (; pd; pd = pd->next)

  cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);    <-- (1)
  cpumask_and(cpus, perf_domain_span(pd), cpu_online_mask);


  if (cpumask_empty(cpus))
    continue;                                               <-- (2)

Can you not mask cpus already early in the pd loop (1) and then profit
from (2) in these rare cases? IIRC, the sd only plays a role here in
exclusive cpusets scenarios which I don't think anybody deploys with EAS?

>  		cpu_actual_cap = get_actual_cpu_capacity(cpu);
>  
>  		eenv.cpu_cap = cpu_actual_cap;
> -		eenv.pd_cap = 0;
> +		eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);
>  
> -		for_each_cpu(cpu, cpus) {
> -			struct rq *rq = cpu_rq(cpu);
> -
> -			eenv.pd_cap += cpu_actual_cap;
> -
> -			if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> -				continue;
> +		cpumask_and(cpus, cpus, sched_domain_span(sd));
>  
> -			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
> -				continue;
> +		for_each_cpu_and(cpu, cpus, p->cpus_ptr) {
> +			struct rq *rq = cpu_rq(cpu);
>  
>  			util = cpu_util(cpu, p, cpu, 0);
>  			cpu_cap = capacity_of(cpu);



* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus
  2025-08-14  8:46 ` Dietmar Eggemann
@ 2025-08-14  9:52   ` Xuewen Yan
  2025-08-15 13:01     ` Dietmar Eggemann
  0 siblings, 1 reply; 10+ messages in thread
From: Xuewen Yan @ 2025-08-14  9:52 UTC
  To: Dietmar Eggemann, Christian Loehle
  Cc: Xuewen Yan, mingo, peterz, juri.lelli, vincent.guittot, rostedt,
	bsegall, mgorman, vschneid, vdonnefort, ke.wang, linux-kernel

Hi Dietmar,

On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
>
>
> On 12.08.25 10:33, Xuewen Yan wrote:
> > Currently we use for_each_cpu() to traverse all of the pd's
> > CPUs in order to compute pd_cap. This approach results in
> > some unnecessary per-CPU checks.
> > Instead, pd_cap can simply be calculated as:
> >
> > pd_cap = cpu_actual_cap * cpumask_weight(pd_cpus);
> >
> > Then the pd's cpus, the sd's cpus and the task's cpus_ptr can
> > be ANDed before traversing, which avoids those checks.
> >
> > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> > ---
> >  kernel/sched/fair.c | 14 ++++----------
> >  1 file changed, 4 insertions(+), 10 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index b173a059315c..e47fe94d6889 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8330,18 +8330,12 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>
> Just a thought ...
>
> for (; pd; pd = pd->next)
>
>   cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);    <-- (1)
>   cpumask_and(cpus, perf_domain_span(pd), cpu_online_mask);
>
>
>   if (cpumask_empty(cpus))
>     continue;                                               <-- (2)
>
> Can you not mask cpus already early in the pd loop (1) and then profit
> from (2) in these rare cases?

I do not think the cpus_ptr mask could be applied before the pd_cap
calculation, because the following scenario should be considered:
the task's cpus_ptr: 0,1,2,3
the pd's cpus: 0,1,2,3,4,5,6
the pd's cap = cpu_cap * 7;
if we cpumask_and(pd's cpus, p->cpus_ptr),
the cpumask_weight = 4,
so the pd's cap = cpu_cap * 4.
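
A minimal sketch of the two orderings, using the values from the
example above (cpus holds the pd's online CPUs, as in
find_energy_efficient_cpu()):

	/* pd_cap must cover all online pd CPUs: */
	eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);	/* cpu_cap * 7 */

	/* masking with p->cpus_ptr first would under-account pd_cap: */
	cpumask_and(cpus, cpus, p->cpus_ptr);
	eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);	/* cpu_cap * 4 */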


> IIRC, the sd only plays a role here in
> exclusive cpusets scenarios which I don't think anybody deploys with EAS?

I am also wondering if the check for SD's CPUs could be removed...

Thanks!
>
> >               cpu_actual_cap = get_actual_cpu_capacity(cpu);
> >
> >               eenv.cpu_cap = cpu_actual_cap;
> > -             eenv.pd_cap = 0;
> > +             eenv.pd_cap = cpu_actual_cap * cpumask_weight(cpus);
> >
> > -             for_each_cpu(cpu, cpus) {
> > -                     struct rq *rq = cpu_rq(cpu);
> > -
> > -                     eenv.pd_cap += cpu_actual_cap;
> > -
> > -                     if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> > -                             continue;
> > +             cpumask_and(cpus, cpus, sched_domain_span(sd));
> >
> > -                     if (!cpumask_test_cpu(cpu, p->cpus_ptr))
> > -                             continue;
> > +             for_each_cpu_and(cpu, cpus, p->cpus_ptr) {
> > +                     struct rq *rq = cpu_rq(cpu);
> >
> >                       util = cpu_util(cpu, p, cpu, 0);
> >                       cpu_cap = capacity_of(cpu);
>


* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus
  2025-08-14  9:52   ` Xuewen Yan
@ 2025-08-15 13:01     ` Dietmar Eggemann
  2025-08-18 11:05       ` Xuewen Yan
  0 siblings, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2025-08-15 13:01 UTC
  To: Xuewen Yan, Christian Loehle
  Cc: Xuewen Yan, mingo, peterz, juri.lelli, vincent.guittot, rostedt,
	bsegall, mgorman, vschneid, vdonnefort, ke.wang, linux-kernel

On 14.08.25 10:52, Xuewen Yan wrote:
> Hi Dietmar,
> 
> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 12.08.25 10:33, Xuewen Yan wrote:

[...]

>> Can you not mask cpus already early in the pd loop (1) and then profit
>> from (2) in these rare cases?
> 
> I do not think the cpus_ptr mask could be applied before the pd_cap
> calculation, because the following scenario should be considered:
> the task's cpus_ptr: 0,1,2,3
> the pd's cpus: 0,1,2,3,4,5,6
> the pd's cap = cpu_cap * 7;
> if we cpumask_and(pd's cpus, p->cpus_ptr),
> the cpumask_weight = 4,
> so the pd's cap = cpu_cap * 4.

Yes, you're right! Missed this one.

>> IIRC, the sd only plays a role here in
>> exclusive cpusets scenarios which I don't think anybody deploys with EAS?
> 
> I am also wondering if the check for SD's CPUs could be removed...

Still not 100% sure here. I would have to play with cpusets and EAS a
little bit more. Are you thinking that in those cases p->cpus_ptr
already covers the cpuset restriction so that the sd mask isn't necessary?


* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus
  2025-08-15 13:01     ` Dietmar Eggemann
@ 2025-08-18 11:05       ` Xuewen Yan
  2025-08-18 15:24         ` Dietmar Eggemann
  0 siblings, 1 reply; 10+ messages in thread
From: Xuewen Yan @ 2025-08-18 11:05 UTC
  To: Dietmar Eggemann
  Cc: Christian Loehle, Xuewen Yan, mingo, peterz, juri.lelli,
	vincent.guittot, rostedt, bsegall, mgorman, vschneid, vdonnefort,
	ke.wang, linux-kernel

On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 14.08.25 10:52, Xuewen Yan wrote:
> > Hi Dietmar,
> >
> > On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
> > <dietmar.eggemann@arm.com> wrote:
> >>
> >> On 12.08.25 10:33, Xuewen Yan wrote:
>
> [...]
>
> >> Can you not mask cpus already early in the pd loop (1) and then profit
> >> from (2) in these rare cases?
> >
> > I do not think the cpus_ptr mask could be applied before the pd_cap
> > calculation, because the following scenario should be considered:
> > the task's cpus_ptr: 0,1,2,3
> > the pd's cpus: 0,1,2,3,4,5,6
> > the pd's cap = cpu_cap * 7;
> > if we cpumask_and(pd's cpus, p->cpus_ptr),
> > the cpumask_weight = 4,
> > so the pd's cap = cpu_cap * 4.
>
> Yes, you're right! Missed this one.
>
> >> IIRC, the sd only plays a role here in
> >> exclusive cpusets scenarios which I don't think anybody deploys with EAS?
> >
> > I am also wondering if the check for SD's CPUs could be removed...
>
> Still not 100% sure here. I would have to play with cpusets and EAS a
> little bit more. Are you thinking that in those cases p->cpus_ptr
> already covers the cpuset restriction so that the sd mask isn't necessary?

I am not familiar with cpuset, so I can't guarantee this. Similarly, I
also need to learn more about cpuset and cpu topology before I can
answer this question.

Thanks!


* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus
  2025-08-18 11:05       ` Xuewen Yan
@ 2025-08-18 15:24         ` Dietmar Eggemann
  2025-08-19  2:02           ` Xuewen Yan
  0 siblings, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2025-08-18 15:24 UTC
  To: Xuewen Yan
  Cc: Christian Loehle, Xuewen Yan, mingo, peterz, juri.lelli,
	vincent.guittot, rostedt, bsegall, mgorman, vschneid, vdonnefort,
	ke.wang, linux-kernel

On 18.08.25 12:05, Xuewen Yan wrote:
> On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 14.08.25 10:52, Xuewen Yan wrote:
>>> Hi Dietmar,
>>>
>>> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
>>> <dietmar.eggemann@arm.com> wrote:
>>>>
>>>> On 12.08.25 10:33, Xuewen Yan wrote:
>>
>> [...]
>>
>>>> Can you not mask cpus already early in the pd loop (1) and then profit
>>>> from (2) in these rare cases?
>>>
>>> I do not think the cpus_ptr mask could be applied before the pd_cap
>>> calculation, because the following scenario should be considered:
>>> the task's cpus_ptr: 0,1,2,3
>>> the pd's cpus: 0,1,2,3,4,5,6
>>> the pd's cap = cpu_cap * 7;
>>> if we cpumask_and(pd's cpus, p->cpus_ptr),
>>> the cpumask_weight = 4,
>>> so the pd's cap = cpu_cap * 4.
>>
>> Yes, you're right! Missed this one.
>>
>>>> IIRC, the sd only plays a role here in
>>>> exclusive cpusets scenarios which I don't think anybody deploys with EAS?
>>>
>>> I am also wondering if the check for SD's CPUs could be removed...
>>
>> Still not 100% sure here. I would have to play with cpusets and EAS a
>> little bit more. Are you thinking that in those cases p->cpus_ptr
>> already covers the cpuset restriction so that the sd mask isn't necessary?
> 
> I am not familiar with cpuset, so I can't guarantee this. Similarly, I
> also need to learn more about cpuset and cpu topology before I can
> answer this question.

Looks like we do also need the sd cpumask here.

Consider this tri-gear system:

#  cat /sys/devices/system/cpu/cpu*/cpu_capacity
160
160
160
160
498
498
1024
1024

and 2 exclusive cpusets cs1={0-1,4,6} and cs2={2-3,5,7}, so EAS is
possible in all 3 root_domains (/, /cs1, /cs2):

...
[   74.691104] CPU1 attaching sched-domain(s):
[   74.691180]  domain-0: span=0-1 level=MC
[   74.691244]   groups: 1:{ span=1 cap=159 }, 0:{ span=0 cap=155 }
[   74.693453]   domain-1: span=0-1,4,6 level=PKG
[   74.693534]    groups: 0:{ span=0-1 cap=314 }, 4:{ span=4 cap=496 },
6:{ span=6 cap=986 }
...
[   74.697890] root domain span: 0-1,4,6
[   74.697994] root_domain 2-3,5,7: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
[   74.698922] root_domain 0-1,4,6: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }


  sd = rcu_dereference(*this_cpu_ptr(&sd_asym_cpucapacity));


Tasks running in '/' only have the sd to restrict the CPU affinity correctly.

...
[001] 5290.935663: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
[001] 5290.935696: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
pd=6-7 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
[001] 5290.935753: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
pd=4-5 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
[001] 5290.935779: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
pd=0-3 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
...
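
(For reference, an exclusive-cpuset setup like cs1/cs2 above can be
sketched with cgroup-v2 cpuset partitions; the paths and values below
are illustrative, not my exact setup:)

  # echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control
  # mkdir /sys/fs/cgroup/cs1
  # echo 0-1,4,6 > /sys/fs/cgroup/cs1/cpuset.cpus
  # echo root > /sys/fs/cgroup/cs1/cpuset.cpus.partition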


* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus
  2025-08-18 15:24         ` Dietmar Eggemann
@ 2025-08-19  2:02           ` Xuewen Yan
  2025-08-19 14:01             ` Dietmar Eggemann
  0 siblings, 1 reply; 10+ messages in thread
From: Xuewen Yan @ 2025-08-19  2:02 UTC
  To: Dietmar Eggemann
  Cc: Christian Loehle, Xuewen Yan, mingo, peterz, juri.lelli,
	vincent.guittot, rostedt, bsegall, mgorman, vschneid, vdonnefort,
	ke.wang, linux-kernel

On Mon, Aug 18, 2025 at 11:24 PM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 18.08.25 12:05, Xuewen Yan wrote:
> > On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
> > <dietmar.eggemann@arm.com> wrote:
> >>
> >> On 14.08.25 10:52, Xuewen Yan wrote:
> >>> Hi Dietmar,
> >>>
> >>> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
> >>> <dietmar.eggemann@arm.com> wrote:
> >>>>
> >>>> On 12.08.25 10:33, Xuewen Yan wrote:
> >>
> >> [...]
> >>
> >>>> Can you not mask cpus already early in the pd loop (1) and then profit
> >>>> from (2) in these rare cases?
> >>>
> >>> I do not think the cpus_ptr mask could be applied before the pd_cap
> >>> calculation, because the following scenario should be considered:
> >>> the task's cpus_ptr: 0,1,2,3
> >>> the pd's cpus: 0,1,2,3,4,5,6
> >>> the pd's cap = cpu_cap * 7;
> >>> if we cpumask_and(pd's cpus, p->cpus_ptr),
> >>> the cpumask_weight = 4,
> >>> so the pd's cap = cpu_cap * 4.
> >>
> >> Yes, you're right! Missed this one.
> >>
> >>>> IIRC, the sd only plays a role here in
> >>>> exclusive cpusets scenarios which I don't think anybody deploys with EAS?
> >>>
> >>> I am also wondering if the check for SD's CPUs could be removed...
> >>
> >> Still not 100% sure here. I would have to play with cpusets and EAS a
> >> little bit more. Are you thinking that in those cases p->cpus_ptr
> >> already covers the cpuset restriction so that the sd mask isn't necessary?
> >
> > I am not familiar with cpuset, so I can't guarantee this. Similarly, I
> > also need to learn more about cpuset and cpu topology before I can
> > answer this question.
>
> Looks like we do also need the sd cpumask here.
>
> Consider this tri-gear system:
>
> #  cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 160
> 160
> 160
> 160
> 498
> 498
> 1024
> 1024
>
> and 2 exclusive cpusets cs1={0-1,4,6} and cs2={2-3,5,7}, so EAS is
> possible in all 3 root_domains (/, /cs1, /cs2):

Isn't your CPU an Arm DynamIQ architecture?
In my understanding, for the DynamIQ arch, there is only one MC domain...
Did I miss something?

Thanks!
>
> ...
> [   74.691104] CPU1 attaching sched-domain(s):
> [   74.691180]  domain-0: span=0-1 level=MC
> [   74.691244]   groups: 1:{ span=1 cap=159 }, 0:{ span=0 cap=155 }
> [   74.693453]   domain-1: span=0-1,4,6 level=PKG
> [   74.693534]    groups: 0:{ span=0-1 cap=314 }, 4:{ span=4 cap=496 },
> 6:{ span=6 cap=986 }
> ...
> [   74.697890] root domain span: 0-1,4,6
> [   74.697994] root_domain 2-3,5,7: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
> cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
> [   74.698922] root_domain 0-1,4,6: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
> cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
>
>
>   sd = rcu_dereference(*this_cpu_ptr(&sd_asym_cpucapacity));
>
>
> Tasks running in '/' only have the sd to restrict the CPU affinity correctly.
>
> ...
> [001] 5290.935663: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
> [001] 5290.935696: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
> pd=6-7 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
> [001] 5290.935753: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
> pd=4-5 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
> [001] 5290.935779: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
> pd=0-3 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
> ...


* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus
  2025-08-19  2:02           ` Xuewen Yan
@ 2025-08-19 14:01             ` Dietmar Eggemann
  2025-08-20 11:09               ` Xuewen Yan
  0 siblings, 1 reply; 10+ messages in thread
From: Dietmar Eggemann @ 2025-08-19 14:01 UTC
  To: Xuewen Yan
  Cc: Christian Loehle, Xuewen Yan, mingo, peterz, juri.lelli,
	vincent.guittot, rostedt, bsegall, mgorman, vschneid, vdonnefort,
	ke.wang, linux-kernel

On 19.08.25 03:02, Xuewen Yan wrote:
> On Mon, Aug 18, 2025 at 11:24 PM Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 18.08.25 12:05, Xuewen Yan wrote:
>>> On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
>>> <dietmar.eggemann@arm.com> wrote:
>>>>
>>>> On 14.08.25 10:52, Xuewen Yan wrote:
>>>>> Hi Dietmar,
>>>>>
>>>>> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
>>>>> <dietmar.eggemann@arm.com> wrote:
>>>>>>
>>>>>> On 12.08.25 10:33, Xuewen Yan wrote:

[...]

>> Looks like we do also need the sd cpumask here.
>>
>> Consider this tri-gear system:
>>
>> #  cat /sys/devices/system/cpu/cpu*/cpu_capacity
>> 160
>> 160
>> 160
>> 160
>> 498
>> 498
>> 1024
>> 1024
>>
>> and 2 exclusive cpusets cs1={0-1,4,6} and cs2={2-3,5,7}, so EAS is
>> possible in all 3 root_domains (/, /cs1, /cs2):
> 
> Isn't your CPU an Arm DynamIQ architecture?
> In my understanding, for the DynamIQ arch, there is only one MC domain...
> Did I miss something?

Ah, I should have mentioned that this is QEMU. I used a dts file
(qemu-system-aarch64 ... -dtb foo.dtb) with individual
'next-level-cache' entries for the CPUs {0-3}, {4-5} and {6-7}, so
that's why you see MC & PKG. Removing those gives you a system with
only MC:

[  106.986828] CPU2 attaching sched-domain(s):
[  106.987846]  domain-0: span=2-3,5,7 level=MC
[  106.987941]   groups: 2:{ span=2 cap=159 }, 3:{ span=3 cap=154 }, 5:{
span=5 cap=495 }, 7:{ span=7 cap=991 }
[  106.988842] CPU3 attaching sched-domain(s):
[  106.989096]  domain-0: span=2-3,5,7 level=MC
[  106.989136]   groups: 3:{ span=3 cap=154 }, 5:{ span=5 cap=495 }, 7:{
span=7 cap=991 }, 2:{ span=2 cap=159 }
[  106.989679] CPU5 attaching sched-domain(s):
[  106.989692]  domain-0: span=2-3,5,7 level=MC
[  106.989773]   groups: 5:{ span=5 cap=495 }, 7:{ span=7 cap=991 }, 2:{
span=2 cap=159 }, 3:{ span=3 cap=154 }
[  106.990466] CPU7 attaching sched-domain(s):
[  106.990482]  domain-0: span=2-3,5,7 level=MC
[  106.990632]   groups: 7:{ span=7 cap=991 }, 2:{ span=2 cap=159 }, 3:{
span=3 cap=154 }, 5:{ span=5 cap=495 }
[  106.997604] root domain span: 2-3,5,7
[  106.998267] CPU0 attaching sched-domain(s):
[  106.998278]  domain-0: span=0-1,4,6 level=MC
[  106.998295]   groups: 0:{ span=0 cap=159 }, 1:{ span=1 cap=160 }, 4:{
span=4 cap=496 }, 6:{ span=6 cap=995 }
[  106.998584] CPU1 attaching sched-domain(s):
[  106.998592]  domain-0: span=0-1,4,6 level=MC
[  106.998604]   groups: 1:{ span=1 cap=160 }, 4:{ span=4 cap=496 }, 6:{
span=6 cap=995 }, 0:{ span=0 cap=159 }
[  106.999477] CPU4 attaching sched-domain(s):
[  106.999487]  domain-0: span=0-1,4,6 level=MC
[  106.999504]   groups: 4:{ span=4 cap=496 }, 6:{ span=6 cap=995 }, 0:{
span=0 cap=159 }, 1:{ span=1 cap=160 }
[  107.000070] CPU6 attaching sched-domain(s):
[  107.000082]  domain-0: span=0-1,4,6 level=MC
[  107.000095]   groups: 6:{ span=6 cap=995 }, 0:{ span=0 cap=159 }, 1:{
span=1 cap=160 }, 4:{ span=4 cap=496 }
[  107.000721] root domain span: 0-1,4,6
[  107.001152] root_domain 2-3,5,7: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
[  107.001869] root_domain 0-1,4,6: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }

[...]


* Re: [RFC PATCH] sched/feec: Simplify the traversal of pd's cpus
  2025-08-19 14:01             ` Dietmar Eggemann
@ 2025-08-20 11:09               ` Xuewen Yan
  0 siblings, 0 replies; 10+ messages in thread
From: Xuewen Yan @ 2025-08-20 11:09 UTC
  To: Dietmar Eggemann
  Cc: Christian Loehle, Xuewen Yan, mingo, peterz, juri.lelli,
	vincent.guittot, rostedt, bsegall, mgorman, vschneid, vdonnefort,
	ke.wang, linux-kernel

On Tue, Aug 19, 2025 at 10:01 PM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> On 19.08.25 03:02, Xuewen Yan wrote:
> > On Mon, Aug 18, 2025 at 11:24 PM Dietmar Eggemann
> > <dietmar.eggemann@arm.com> wrote:
> >>
> >> On 18.08.25 12:05, Xuewen Yan wrote:
> >>> On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
> >>> <dietmar.eggemann@arm.com> wrote:
> >>>>
> >>>> On 14.08.25 10:52, Xuewen Yan wrote:
> >>>>> Hi Dietmar,
> >>>>>
> >>>>> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
> >>>>> <dietmar.eggemann@arm.com> wrote:
> >>>>>>
> >>>>>> On 12.08.25 10:33, Xuewen Yan wrote:
>
> [...]
>
> >> Looks like we do also need the sd cpumask here.
> >>
> >> Consider this tri-gear system:
> >>
> >> #  cat /sys/devices/system/cpu/cpu*/cpu_capacity
> >> 160
> >> 160
> >> 160
> >> 160
> >> 498
> >> 498
> >> 1024
> >> 1024
> >>
> >> and 2 exclusive cpusets cs1={0-1,4,6} and cs2={2-3,5,7}, so EAS is
> >> possible in all 3 root_domains (/, /cs1, /cs2):
> >
> > Isn't your CPU an Arm DynamIQ architecture?
> > In my understanding, for the DynamIQ arch, there is only one MC domain...
> > Did I miss something?
>
> Ah, I should have mentioned that this is QEMU. I used a dts file
> (qemu-system-aarch64 ... -dtb foo.dtb) with individual
> 'next-level-cache' entries for the CPUs {0-3}, {4-5} and {6-7}, so
> that's why you see MC & PKG. Removing those gives you a system with
> only MC:

Thank you for your patience in explaining :)
Looks like we do also need the sd cpumask here.

Thanks!

>
> [  106.986828] CPU2 attaching sched-domain(s):
> [  106.987846]  domain-0: span=2-3,5,7 level=MC
> [  106.987941]   groups: 2:{ span=2 cap=159 }, 3:{ span=3 cap=154 }, 5:{
> span=5 cap=495 }, 7:{ span=7 cap=991 }
> [  106.988842] CPU3 attaching sched-domain(s):
> [  106.989096]  domain-0: span=2-3,5,7 level=MC
> [  106.989136]   groups: 3:{ span=3 cap=154 }, 5:{ span=5 cap=495 }, 7:{
> span=7 cap=991 }, 2:{ span=2 cap=159 }
> [  106.989679] CPU5 attaching sched-domain(s):
> [  106.989692]  domain-0: span=2-3,5,7 level=MC
> [  106.989773]   groups: 5:{ span=5 cap=495 }, 7:{ span=7 cap=991 }, 2:{
> span=2 cap=159 }, 3:{ span=3 cap=154 }
> [  106.990466] CPU7 attaching sched-domain(s):
> [  106.990482]  domain-0: span=2-3,5,7 level=MC
> [  106.990632]   groups: 7:{ span=7 cap=991 }, 2:{ span=2 cap=159 }, 3:{
> span=3 cap=154 }, 5:{ span=5 cap=495 }
> [  106.997604] root domain span: 2-3,5,7
> [  106.998267] CPU0 attaching sched-domain(s):
> [  106.998278]  domain-0: span=0-1,4,6 level=MC
> [  106.998295]   groups: 0:{ span=0 cap=159 }, 1:{ span=1 cap=160 }, 4:{
> span=4 cap=496 }, 6:{ span=6 cap=995 }
> [  106.998584] CPU1 attaching sched-domain(s):
> [  106.998592]  domain-0: span=0-1,4,6 level=MC
> [  106.998604]   groups: 1:{ span=1 cap=160 }, 4:{ span=4 cap=496 }, 6:{
> span=6 cap=995 }, 0:{ span=0 cap=159 }
> [  106.999477] CPU4 attaching sched-domain(s):
> [  106.999487]  domain-0: span=0-1,4,6 level=MC
> [  106.999504]   groups: 4:{ span=4 cap=496 }, 6:{ span=6 cap=995 }, 0:{
> span=0 cap=159 }, 1:{ span=1 cap=160 }
> [  107.000070] CPU6 attaching sched-domain(s):
> [  107.000082]  domain-0: span=0-1,4,6 level=MC
> [  107.000095]   groups: 6:{ span=6 cap=995 }, 0:{ span=0 cap=159 }, 1:{
> span=1 cap=160 }, 4:{ span=4 cap=496 }
> [  107.000721] root domain span: 0-1,4,6
> [  107.001152] root_domain 2-3,5,7: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
> cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
> [  107.001869] root_domain 0-1,4,6: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{
> cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
>
> [...]

