From: Andrea Righi <arighi@nvidia.com>
To: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Christian Loehle <christian.loehle@arm.com>,
Phil Auld <pauld@redhat.com>, Koba Ko <kobak@nvidia.com>,
Felix Abecassis <fabecassis@nvidia.com>,
Balbir Singh <balbirs@nvidia.com>,
Joel Fernandes <joelagnelf@nvidia.com>,
linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>
Subject: Re: [PATCH 4/5] sched/fair: Reject misfit pulls onto busy SMT siblings on asym-capacity
Date: Sat, 16 May 2026 11:04:25 +0200
Message-ID: <aggzGRmNzFfdPJdq@gpd4>
In-Reply-To: <b6c72c48-e24e-49fe-9814-8c4334d70e1d@linux.ibm.com>

Hi Shrikanth,

On Fri, May 15, 2026 at 03:39:55PM +0530, Shrikanth Hegde wrote:
> On 5/9/26 11:37 PM, Andrea Righi wrote:
> > When SD_ASYM_CPUCAPACITY load balancing considers pulling a misfit task,
> > capacity_of(dst_cpu) can overstate available compute if the SMT sibling is
> > busy: the core does not deliver its full nominal capacity.
> >
> > If SMT is active and dst_cpu is not on a fully idle core, skip this
> > destination so we do not migrate a misfit expecting a capacity upgrade we
> > cannot actually provide.
> >
> > Cc: Vincent Guittot <vincent.guittot@linaro.org>
> > Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > Cc: Christian Loehle <christian.loehle@arm.com>
> > Cc: Koba Ko <kobak@nvidia.com>
> > Cc: K Prateek Nayak <kprateek.nayak@amd.com>
> > Reported-by: Felix Abecassis <fabecassis@nvidia.com>
> > Signed-off-by: Andrea Righi <arighi@nvidia.com>
> > ---
> > kernel/sched/fair.c | 11 ++++++++++-
> > 1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 6f0835c15ee11..2ddba8bd27e59 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9693,6 +9693,7 @@ struct lb_env {
> > int dst_cpu;
> > struct rq *dst_rq;
> > + bool dst_core_idle;
> > struct cpumask *dst_grpmask;
> > int new_dst_cpu;
> > @@ -10918,10 +10919,16 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> > * We can use max_capacity here as reduction in capacity on some
> > * CPUs in the group should either be possible to resolve
> > * internally or be covered by avg_load imbalance (eventually).
> > + *
> > + * When SMT is active, only pull a misfit to dst_cpu if it is on a
> > + * fully idle core; otherwise the effective capacity of the core is
> > + * reduced and we may not actually provide more capacity than the
> > + * source.
> > */
> > if ((env->sd->flags & SD_ASYM_CPUCAPACITY) &&
> > (sgs->group_type == group_misfit_task) &&
> > - (!capacity_greater(capacity_of(env->dst_cpu), sg->sgc->max_capacity) ||
> > + (!env->dst_core_idle ||
> > + !capacity_greater(capacity_of(env->dst_cpu), sg->sgc->max_capacity) ||
> > sds->local_stat.group_type != group_has_spare))
> > return false;
> > @@ -11485,6 +11492,8 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
> > unsigned long sum_util = 0;
> > bool sg_overloaded = 0, sg_overutilized = 0;
> > + env->dst_core_idle = !sched_smt_active() || is_core_idle(env->dst_cpu);
> > +
> > do {
> > struct sg_lb_stats *sgs = &tmp_sgs;
> > int local_group;
>
>
> This is kind of similar to what ASYM_PACKING would have done at MC domain with
> equal CPU capacities. i.e pull the load if the core is idle.
I think that's right. Semantically, "only pull a misfit to dst_cpu if its core
is idle" is essentially the same heuristic that SD_ASYM_PACKING ends up applying
at MC: prefer destinations on cores that can actually deliver their nominal
capacity. With equal per-CPU priorities the asym_packing path collapses to
"prefer the idle core", which is what this patch enforces for the misfit case.
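
To make the heuristic concrete, here is a minimal userspace sketch of the
check, not the kernel code itself. struct toy_cpu and all toy_*() helpers are
hypothetical names invented for this illustration; the 1024/1078 margin is
assumed to mirror capacity_greater() in kernel/sched/fair.c, and the
group_has_spare condition from the real hunk is left out for brevity.

```c
/*
 * Userspace toy model, NOT kernel code. struct toy_cpu and the toy_*()
 * helpers are hypothetical names made up for this illustration.
 */
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

struct toy_cpu {
	unsigned long capacity;	/* nominal capacity, 1024 = biggest CPU */
	bool busy;		/* something is running on this CPU */
	int sibling;		/* SMT sibling index, -1 if none */
};

/* Assumed to mirror the ~5% margin of capacity_greater() in fair.c. */
static bool toy_capacity_greater(unsigned long cap1, unsigned long cap2)
{
	return cap1 * 1024 > cap2 * 1078;
}

/* Only the sibling is inspected; the CPU itself is assumed to be idle. */
static bool toy_core_idle(const struct toy_cpu *cpus, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return cpus[cpu].sibling < 0 || !cpus[cpus[cpu].sibling].busy;
}

/*
 * Allow a misfit pull only if dst_cpu sits on a fully idle core (when SMT
 * is active) and has noticeably more capacity than the source group.
 */
static bool toy_may_pull_misfit(const struct toy_cpu *cpus, int dst_cpu,
				unsigned long src_max_capacity,
				bool smt_active)
{
	bool dst_core_idle = !smt_active || toy_core_idle(cpus, dst_cpu);

	return dst_core_idle &&
	       toy_capacity_greater(cpus[dst_cpu].capacity, src_max_capacity);
}
```

With two big cores of two SMT threads each and one busy thread on the first
core, toy_may_pull_misfit() rejects the idle thread on the first core and
accepts either thread on the second one; with smt_active == false the core
check is skipped entirely.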
>
> In your table in the cover-letter, if you do "NO ASYM + SIS_UTIL + ASYM_PACKING (at MC)"
> does it achieve close to "ASYM + SMT + SIS_UTIL"?
Christian already explored the "NO ASYM_CPUCAPACITY + SD_ASYM_PACKING" idea
(https://lore.kernel.org/all/20260325181314.3875909-1-christian.loehle@arm.com).

I gave it a spin on Vera at the time. Summarizing the numbers I reported on that
thread (all relative to the baseline, i.e. default SD_ASYM_CPUCAPACITY with no
SMT awareness, on my CPU-bound workload):

 - SD_ASYM_PACKING at MC (Christian's RFC): ~1.5x speedup
 - equalize capacities within +/-5% (NO_ASYM): ~1.6x speedup
 - SMT-aware SD_ASYM_CPUCAPACITY (PATCH 3/5): ~1.7x speedup

So SD_ASYM_PACKING seems to help, but not as much as the NO_ASYM variant (even
if it's pretty close) or this series.

I think the structural reason is that ASYM_PACKING at MC only fixes destination
selection in load balancing; it doesn't change select_idle_capacity() /
asym_fits_cpu() on the wakeup path, where I think most of the placement
decisions actually happen in this case.
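
As a rough illustration of why the wakeup path matters, the sketch below
models an idle-CPU pick with and without SMT awareness. It is a userspace toy,
not select_idle_capacity(): struct toy_wake_cpu and the toy_*() helpers are
hypothetical, and the 1280/1024 headroom is assumed to mirror fits_capacity()
in kernel/sched/fair.c.

```c
/*
 * Userspace toy model, NOT kernel code. All names below are hypothetical
 * and exist only for this illustration.
 */
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

struct toy_wake_cpu {
	unsigned long capacity;	/* nominal capacity, 1024 = biggest CPU */
	bool busy;		/* something is running on this CPU */
	int sibling;		/* SMT sibling index, -1 if none */
};

/* Assumed to mirror the ~20% headroom of fits_capacity() in fair.c. */
static bool toy_fits(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/*
 * Return an idle CPU that fits task_util, or -1 if there is none. With
 * smt_aware set, CPUs on fully idle cores win and an idle CPU with a busy
 * sibling is only used as a fallback.
 */
static int toy_select_idle_cpu(const struct toy_wake_cpu *cpus,
			       unsigned long task_util, bool smt_aware)
{
	int fallback = -1;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		int sib = cpus[cpu].sibling;

		if (cpus[cpu].busy || !toy_fits(task_util, cpus[cpu].capacity))
			continue;
		if (!smt_aware)
			return cpu;	/* first idle CPU that fits */
		assert(sib < TOY_NR_CPUS);
		if (sib < 0 || !cpus[sib].busy)
			return cpu;	/* fully idle core */
		if (fallback < 0)
			fallback = cpu;	/* idle CPU, busy sibling */
	}
	return fallback;
}
```

In this model, a fix that only touches load-balance destination selection
leaves the smt_aware == false behaviour in place for every wakeup, so tasks
can still land next to a busy sibling until a later balance pass moves them.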
Thanks,
-Andrea

Thread overview: 21+ messages
2026-05-09 18:07 [PATCH v6 0/5 RESEND] sched/fair: SMT-aware asymmetric CPU capacity Andrea Righi
2026-05-09 18:07 ` [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path Andrea Righi
2026-05-11 13:04 ` Vincent Guittot
2026-05-15 6:49 ` Shrikanth Hegde
2026-05-16 5:45 ` Andrea Righi
2026-05-09 18:07 ` [PATCH 2/5] sched/fair: Attach sched_domain_shared to sd_asym_cpucapacity Andrea Righi
2026-05-11 13:04 ` Vincent Guittot
2026-05-15 10:05 ` Shrikanth Hegde
2026-05-16 5:58 ` [PATCH v2 " Andrea Righi
2026-05-09 18:07 ` [PATCH 3/5] sched/fair: Prefer fully-idle SMT cores in asym-capacity idle selection Andrea Righi
2026-05-11 13:07 ` Vincent Guittot
2026-05-11 13:45 ` Andrea Righi
2026-05-11 14:25 ` [PATCH v2 " Andrea Righi
2026-05-09 18:07 ` [PATCH 4/5] sched/fair: Reject misfit pulls onto busy SMT siblings on asym-capacity Andrea Righi
2026-05-11 13:07 ` Vincent Guittot
2026-05-15 10:09 ` Shrikanth Hegde
2026-05-16 9:04 ` Andrea Righi [this message]
2026-05-09 18:07 ` [PATCH 5/5] sched/fair: Add SIS_UTIL support to select_idle_capacity() Andrea Righi
2026-05-11 13:08 ` Vincent Guittot
-- strict thread matches above, loose matches on Subject: below --
2026-05-09 18:01 Andrea Righi
2026-05-09 18:01 ` [PATCH 4/5] sched/fair: Reject misfit pulls onto busy SMT siblings on asym-capacity Andrea Righi
2026-04-28 14:41 [PATCH v5 0/5] sched/fair: SMT-aware asymmetric CPU capacity Andrea Righi
2026-04-28 14:41 ` [PATCH 4/5] sched/fair: Reject misfit pulls onto busy SMT siblings on asym-capacity Andrea Righi