public inbox for linux-kernel@vger.kernel.org
From: Andrea Righi <arighi@nvidia.com>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Christian Loehle <christian.loehle@arm.com>,
	Koba Ko <kobak@nvidia.com>,
	Felix Abecassis <fabecassis@nvidia.com>,
	Balbir Singh <balbirs@nvidia.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] sched/fair: SMT-aware asymmetric CPU capacity
Date: Tue, 7 Apr 2026 21:16:02 +0200	[thread overview]
Message-ID: <adVX8pRVRC1dAD1S@gpd4> (raw)
In-Reply-To: <e3ede6ad-f5d4-4c56-86f2-06bc6026ebfc@arm.com>

Hi Dietmar,

On Tue, Apr 07, 2026 at 01:50:51PM +0200, Dietmar Eggemann wrote:
> On 03.04.26 22:44, Andrea Righi wrote:
> > On Fri, Apr 03, 2026 at 04:46:03PM +0200, Andrea Righi wrote:
> >> On Fri, Apr 03, 2026 at 01:47:17PM +0200, Dietmar Eggemann wrote:
> > ...
> >>>> Looking at the data:
> >>>>  - SIS_UTIL doesn't seem relevant in this case (differences are within
> >>>>    error range),
> >>>>  - ASYM_CPU_CAPACITY seems to provide a small throughput gain, but it seems
> >>>>    more beneficial for tail latency reduction,
> >>>>  - the ILB SMT patch seems to slightly improve throughput, but the biggest
> >>>>    benefit is still coming from ASYM_CPU_CAPACITY.
> >>>
> >>>> Overall, also in this case it seems beneficial to use ASYM_CPU_CAPACITY
> >>>> rather than equalizing the capacities.
> >>>>
> >>>> That said, I'm still not sure why ASYM is helping. The frequency asymmetry
> >>>
> >>> OK, I still would be more comfortable with this if I knew why
> >>> this is :-)
> >>
> >> Working on this. :)
> > 
> > Alright, I think I found something. I tried to make sis() behave more like sic()
> > by adding the same SMT "full idle core" check in the fast path and removing the
> > extra select_idle_smt(prev) hop from the LLC idle path.
> > 
> > Essentially this:
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 7bebceb5ed9df..19fffa2df2d36 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7651,29 +7651,6 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
> >  	return -1;
> >  }
> >  
> > -/*
> > - * Scan the local SMT mask for idle CPUs.
> > - */
> > -static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> > -{
> > -	int cpu;
> > -
> > -	for_each_cpu_and(cpu, cpu_smt_mask(target), p->cpus_ptr) {
> > -		if (cpu == target)
> > -			continue;
> > -		/*
> > -		 * Check if the CPU is in the LLC scheduling domain of @target.
> > -		 * Due to isolcpus, there is no guarantee that all the siblings are in the domain.
> > -		 */
> > -		if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> > -			continue;
> > -		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) 
> > -			return cpu;
> 
> So it is this returning of CPU from the smt mask rather than the
> 
>     for_each_cpu_wrap(cpu, cpus, target + 1)
> 
>         __select_idle_cpu()
> 
>             if (choose_idle_cpu(cpu, p) && ...)
>                 return cpu
> 
> where cpus is cpumask_and(cpus, sched_domain_span(MC), p->cpus_ptr)

Right, and this is exactly the behavioral difference I was trying to
eliminate from sis() to make it behave like sic().

> 
> I wonder whether this has anything to do with your NVIDIA Spatial
> Multithreading (SMT) versus Traditional (time-shared resources) SMT?

I don't have data to prove or disprove that... it'd be interesting to try the
same approach on a system with traditional SMT.

> 
> 
> > -	}
> > -
> > -	return -1;
> > -}
> > -
> >  #else /* !CONFIG_SCHED_SMT: */
> >  
> >  static inline void set_idle_cores(int cpu, int val)
> > @@ -7690,11 +7667,6 @@ static inline int select_idle_core(struct task_struct *p, int core, struct cpuma
> >  	return __select_idle_cpu(core, p);
> >  }
> >  
> > -static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> > -{
> > -	return -1;
> > -}
> > -
> >  #endif /* !CONFIG_SCHED_SMT */
> >  
> >  /*
> > @@ -7859,7 +7831,7 @@ static inline bool asym_fits_cpu(unsigned long util,
> >  		       (util_fits_cpu(util, util_min, util_max, cpu) > 0);
> >  	}
> >  
> > -	return true;
> > +	return !sched_smt_active() || is_core_idle(cpu);
> >  }
> 
> This change seems to be orthogonal to the removal of select_idle_smt()
> for sis()?

Right, essentially this modifies sis() to return a CPU early only if it
belongs to a fully-idle core.

> 
> BTW, the is_core_idle() in asym_fits_cpu() (used for those early return
> CPU conditions in sis()) is something we don't have on the NO_ASYM side
> where we only use choose_idle_cpu().

You mean without this change? In that case, yes, because asym_fits_cpu() was
just a no-op. This is one of the behavioral changes applied to sis() to make
it similar to sic() with SMT awareness.

> 
> >  /*
> > @@ -7964,16 +7936,9 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >  	if (!sd)
> >  		return target;
> >  
> > -	if (sched_smt_active()) {
> > +	if (sched_smt_active())
> >  		has_idle_core = test_idle_cores(target);
> >  
> > -		if (!has_idle_core && cpus_share_cache(prev, target)) {
> > -			i = select_idle_smt(p, sd, prev);
> > -			if ((unsigned int)i < nr_cpumask_bits)
> > -				return i;
> > -		}
> > -	}
> > -
> >  	i = select_idle_cpu(p, sd, has_idle_core, target);
> >  	if ((unsigned)i < nr_cpumask_bits)
> >  		return i;
> > 
> > ---
> > 
> > With this applied, I see identical performance between NO_ASYM and ASYM+SMT.
> 
> Interesting!
> 
> > I'm not suggesting to apply this, but that seems to be the reason why ASYM+SMT
> > performs better in my case.
> > 
> > -Andrea
> 

Thanks,
-Andrea

Thread overview: 44+ messages
2026-03-26 15:02 [PATCH 0/4] sched/fair: SMT-aware asymmetric CPU capacity Andrea Righi
2026-03-26 15:02 ` [PATCH 1/4] sched/fair: Prefer fully-idle SMT cores in asym-capacity idle selection Andrea Righi
2026-03-27  8:09   ` Vincent Guittot
2026-03-27  9:46     ` Andrea Righi
2026-03-27 10:44   ` K Prateek Nayak
2026-03-27 10:58     ` Andrea Righi
2026-03-27 11:14       ` K Prateek Nayak
2026-03-27 16:39         ` Andrea Righi
2026-03-30 10:17           ` K Prateek Nayak
2026-03-30 13:07             ` Vincent Guittot
2026-03-30 13:22             ` Andrea Righi
2026-03-30 13:46               ` Andrea Righi
2026-03-26 15:02 ` [PATCH 2/4] sched/fair: Reject misfit pulls onto busy SMT siblings on asym-capacity Andrea Righi
2026-03-26 15:02 ` [PATCH 3/4] sched/fair: Enable EAS with SMT on SD_ASYM_CPUCAPACITY systems Andrea Righi
2026-03-27  8:09   ` Vincent Guittot
2026-03-27  9:45     ` Andrea Righi
2026-03-26 15:02 ` [PATCH 4/4] sched/fair: Prefer fully-idle SMT core for NOHZ idle load balancer Andrea Righi
2026-03-27  8:45   ` Vincent Guittot
2026-03-27  9:44     ` Andrea Righi
2026-03-27 11:34       ` K Prateek Nayak
2026-03-27 20:36         ` Andrea Righi
2026-03-27 22:45           ` Andrea Righi
2026-03-30 17:29         ` Andrea Righi
2026-03-27 13:44   ` Shrikanth Hegde
2026-03-26 16:33 ` [PATCH 0/4] sched/fair: SMT-aware asymmetric CPU capacity Christian Loehle
2026-03-27  6:52   ` Andrea Righi
2026-03-27 16:31 ` Shrikanth Hegde
2026-03-27 17:08   ` Andrea Righi
2026-03-28  6:51     ` Shrikanth Hegde
2026-03-28 13:03 ` Balbir Singh
2026-03-28 22:50   ` Andrea Righi
2026-03-29 21:36     ` Balbir Singh
2026-03-30 22:30 ` Dietmar Eggemann
2026-03-31  9:04   ` Andrea Righi
2026-04-01 11:57     ` Dietmar Eggemann
2026-04-01 12:08       ` Vincent Guittot
2026-04-01 12:42         ` Andrea Righi
2026-04-01 13:12           ` Andrea Righi
2026-04-03 11:47             ` Dietmar Eggemann
2026-04-03 14:45               ` Andrea Righi
2026-04-03 20:44                 ` Andrea Righi
2026-04-07 11:50                   ` Dietmar Eggemann
2026-04-07 19:16                     ` Andrea Righi [this message]
2026-04-03 11:47           ` Dietmar Eggemann
