Re: [PATCH 0/4] sched/fair: SMT-aware asymmetric CPU capacity

From: Andrea Righi <arighi@nvidia.com>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Christian Loehle <christian.loehle@arm.com>,
	Koba Ko <kobak@nvidia.com>,
	Felix Abecassis <fabecassis@nvidia.com>,
	Balbir Singh <balbirs@nvidia.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] sched/fair: SMT-aware asymmetric CPU capacity
Date: Tue, 7 Apr 2026 21:16:02 +0200
Message-ID: <adVX8pRVRC1dAD1S@gpd4>
In-Reply-To: <e3ede6ad-f5d4-4c56-86f2-06bc6026ebfc@arm.com>

Hi Dietmar,

On Tue, Apr 07, 2026 at 01:50:51PM +0200, Dietmar Eggemann wrote:
> On 03.04.26 22:44, Andrea Righi wrote:
> > On Fri, Apr 03, 2026 at 04:46:03PM +0200, Andrea Righi wrote:
> >> On Fri, Apr 03, 2026 at 01:47:17PM +0200, Dietmar Eggemann wrote:
> > ...
> >>>> Looking at the data:
> >>>>  - SIS_UTIL doesn't seem relevant in this case (differences are within
> >>>>    error range),
> >>>>  - ASYM_CPU_CAPACITY seems to provide a small throughput gain, but it seems
> >>>>    more beneficial for tail latency reduction,
> >>>>  - the ILB SMT patch seems to slightly improve throughput, but the biggest
> >>>>    benefit is still coming from ASYM_CPU_CAPACITY.
> >>>
> >>>> Overall, also in this case it seems beneficial to use ASYM_CPU_CAPACITY
> >>>> rather than equalizing the capacities.
> >>>>
> >>>> That said, I'm still not sure why ASYM is helping. The frequency asymmetry
> >>>
> >>> OK, I would still be more comfortable with this if I knew why
> >>> this is :-)
> >>
> >> Working on this. :)
> > 
> > Alright, I think I found something. I tried to make sis() behave more like sic()
> > by adding the same SMT "full idle core" check in the fast path and removing the
> > extra select_idle_smt(prev) hop from the LLC idle path.
> > 
> > Essentially this:
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 7bebceb5ed9df..19fffa2df2d36 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7651,29 +7651,6 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
> >  	return -1;
> >  }
> >  
> > -/*
> > - * Scan the local SMT mask for idle CPUs.
> > - */
> > -static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> > -{
> > -	int cpu;
> > -
> > -	for_each_cpu_and(cpu, cpu_smt_mask(target), p->cpus_ptr) {
> > -		if (cpu == target)
> > -			continue;
> > -		/*
> > -		 * Check if the CPU is in the LLC scheduling domain of @target.
> > -		 * Due to isolcpus, there is no guarantee that all the siblings are in the domain.
> > -		 */
> > -		if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> > -			continue;
> > -		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) 
> > -			return cpu;
> 
> So it is this returning of CPU from the smt mask rather than the
> 
>     for_each_cpu_wrap(cpu, cpus, target + 1)
> 
>         __select_idle_cpu()
> 
>             if (choose_idle_cpu(cpu, p) && ...)
>                 return cpu
> 
> where cpus is cpumask_and(cpus, sched_domain_span(MC), p->cpus_ptr)

Right, and this is a different behavior that I was trying to eliminate from
sis() to make it similar to sic().
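
To make the difference concrete, here is a toy user-space sketch (not kernel
code). It assumes a made-up 8-CPU topology where CPUs 2n and 2n+1 are SMT
siblings, and all toy_* names are hypothetical. It contrasts a
select_idle_smt()-style pick, which only looks at the siblings of prev, with a
wrap-around scan that starts at target + 1. The sched-domain span check from
the real function is left out:

```c
#include <stdint.h>

#define TOY_NR_CPUS 8	/* assumption: 4 cores x 2 SMT threads */

/* Assumed topology: CPUs 2n and 2n+1 share a core. */
static uint32_t toy_smt_mask(int cpu)
{
	return 3u << (cpu & ~1);
}

/*
 * select_idle_smt()-like pick: look only at the SMT siblings of @prev
 * (skipping @prev itself) and return the first idle, allowed one.
 */
static int toy_select_idle_smt(uint32_t idle, uint32_t allowed, int prev)
{
	uint32_t cand = toy_smt_mask(prev) & allowed & idle & ~(1u << prev);

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (cand & (1u << cpu))
			return cpu;
	return -1;
}

/*
 * Wrap-around scan in the spirit of for_each_cpu_wrap(cpu, cpus, target + 1):
 * start right after @target, wrap, and visit @target last.
 */
static int toy_scan_wrap(uint32_t idle, uint32_t allowed, int target)
{
	for (int i = 0; i < TOY_NR_CPUS; i++) {
		int cpu = (target + 1 + i) % TOY_NR_CPUS;

		if (idle & allowed & (1u << cpu))
			return cpu;
	}
	return -1;
}
```

With only CPUs 3 and 6 idle, prev = 2 and target = 5, toy_select_idle_smt()
picks CPU 3 (the sibling of prev), while toy_scan_wrap() picks CPU 6 (the first
idle CPU after target).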

> 
> I wonder whether this has anything to do with your NVIDIA Spatial
> Multithreading (SMT) versus Traditional (time-shared resources) SMT?

I don't have data to prove or disprove that... it'd be interesting to try the
same approach on a system with traditional SMT.

> 
> 
> > -	}
> > -
> > -	return -1;
> > -}
> > -
> >  #else /* !CONFIG_SCHED_SMT: */
> >  
> >  static inline void set_idle_cores(int cpu, int val)
> > @@ -7690,11 +7667,6 @@ static inline int select_idle_core(struct task_struct *p, int core, struct cpuma
> >  	return __select_idle_cpu(core, p);
> >  }
> >  
> > -static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> > -{
> > -	return -1;
> > -}
> > -
> >  #endif /* !CONFIG_SCHED_SMT */
> >  
> >  /*
> > @@ -7859,7 +7831,7 @@ static inline bool asym_fits_cpu(unsigned long util,
> >  		       (util_fits_cpu(util, util_min, util_max, cpu) > 0);
> >  	}
> >  
> > -	return true;
> > +	return !sched_smt_active() || is_core_idle(cpu);
> >  }
> 
> This change seems to be orthogonal to the removal of select_idle_smt()
> for sis()?

Right, essentially this makes the early-return checks in sis() pick a CPU only
if it belongs to a fully-idle core.

> 
> BTW, the is_core_idle() in asym_fits_cpu() (used for those early return
> CPU conditions in sis()) is something we don't have on the NO_ASYM side
> where we only use choose_idle_cpu().

You mean without this change? In that case, yes, because asym_fits_cpu() was
just a no-op. This is one of the behavior changes in sis() to make it similar to
sic() with SMT awareness.
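
As a rough illustration of what the modified fast-path check does, here is a
small user-space model (not kernel code). The 2-way SMT layout with siblings
2n/2n+1 and all toy_* names are assumptions made up for this example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed topology: CPUs 2n and 2n+1 are SMT siblings. */
static uint32_t toy_sibling_mask(int cpu)
{
	return 3u << (cpu & ~1);
}

/*
 * Rough analogue of is_core_idle(): every sibling of @cpu, excluding
 * @cpu itself, must be idle. Bit N of @idle is set when CPU N is idle.
 */
static bool toy_is_core_idle(uint32_t idle, int cpu)
{
	uint32_t others = toy_sibling_mask(cpu) & ~(1u << cpu);

	return (idle & others) == others;
}

/*
 * Fast-path predicate with the experimental change applied on a
 * non-asymmetric system: the CPU must be idle and, when SMT is active,
 * its whole core must be idle as well.
 */
static bool toy_fast_path_ok(uint32_t idle, bool smt_active, int cpu)
{
	if (!(idle & (1u << cpu)))
		return false;

	return !smt_active || toy_is_core_idle(idle, cpu);
}
```

For example, if CPU 0 is idle but its sibling CPU 1 is busy, the check rejects
CPU 0 when SMT is active and accepts it when SMT is not active.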

> 
> >  /*
> > @@ -7964,16 +7936,9 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >  	if (!sd)
> >  		return target;
> >  
> > -	if (sched_smt_active()) {
> > +	if (sched_smt_active())
> >  		has_idle_core = test_idle_cores(target);
> >  
> > -		if (!has_idle_core && cpus_share_cache(prev, target)) {
> > -			i = select_idle_smt(p, sd, prev);
> > -			if ((unsigned int)i < nr_cpumask_bits)
> > -				return i;
> > -		}
> > -	}
> > -
> >  	i = select_idle_cpu(p, sd, has_idle_core, target);
> >  	if ((unsigned)i < nr_cpumask_bits)
> >  		return i;
> > 
> > ---
> > 
> > With this applied, I see identical performance between NO_ASYM and ASYM+SMT.
> 
> Interesting!
> 
> > I'm not suggesting to apply this, but that seems to be the reason why ASYM+SMT
> > performs better in my case.
> > 
> > -Andrea
> 

Thanks,
-Andrea
