From: Andrea Righi <arighi@nvidia.com>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Christian Loehle <christian.loehle@arm.com>,
Phil Auld <pauld@redhat.com>, Koba Ko <kobak@nvidia.com>,
Felix Abecassis <fabecassis@nvidia.com>,
Balbir Singh <balbirs@nvidia.com>,
Joel Fernandes <joelagnelf@nvidia.com>,
Shrikanth Hegde <sshegde@linux.ibm.com>,
linux-kernel@vger.kernel.org, tim.c.chen@linux.intel.com,
yu.c.chen@intel.com
Subject: Re: [PATCH v2 2/5] sched/fair: Attach sched_domain_shared to sd_asym_cpucapacity
Date: Tue, 19 May 2026 09:54:02 +0200
Message-ID: <agwXGgTQEqF0sn6E@gpd4>
In-Reply-To: <55196e3b-ba1e-42c9-b80b-5c91306df452@amd.com>
On Tue, May 19, 2026 at 01:17:20PM +0530, K Prateek Nayak wrote:
> Hello Andrea,
>
> Thank you for taking a look at the diff!
BTW I just re-ran the NVBLAS benchmark on a Vera Rubin machine using
queue:sched/core + this on top, all good!

Thanks,
-Andrea
>
> On 5/19/2026 12:13 PM, Andrea Righi wrote:
> > Hi Prateek,
> >
> > On Tue, May 19, 2026 at 11:22:32AM +0530, K Prateek Nayak wrote:
> >> Hello Peter, Andrea,
> >>
> >> On 5/19/2026 2:28 AM, Peter Zijlstra wrote:
> >>> @@@ -2775,20 -3049,16 +3107,15 @@@ build_sched_domains(const struct cpumas
> >>> if (!sd)
> >>> continue;
> >>>
> >>> + if (has_asym)
> >>> - asym_claimed = claim_asym_sched_domain_shared(&d, i);
> >>> ++ claim_asym_sched_domain_shared(&d, i);
> >>> +
> >>> /* First, find the topmost SD_SHARE_LLC domain */
> >>> while (sd->parent && (sd->parent->flags & SD_SHARE_LLC))
> >>> sd = sd->parent;
> >>>
> >>> if (sd->flags & SD_SHARE_LLC) {
> >>> - /*
> >>> - * Initialize the sd->shared for SD_SHARE_LLC unless
> >>> - * the asym path above already claimed it.
> >>> - */
> >>> - if (!asym_claimed)
> >>> - init_sched_domain_shared(&d, sd);
> >>> - int sd_id = cpumask_first(sched_domain_span(sd));
> >>> -
> >>> - sd->shared = *per_cpu_ptr(d.sds, sd_id);
> >>> - atomic_set(&sd->shared->nr_busy_cpus, sd->span_weight);
> >>> - atomic_inc(&sd->shared->ref);
> >>> ++ init_sched_domain_shared(&d, sd);
> >>
> >> This will run into a small problem with "nr_idle_scan" if
> >> cpumask_first(sched_domain_span(sd)) is the same for both sd_asym and
> >> sd_llc.
> >
> > Ah, good catch! When cpumask_first(asym_span) == cpumask_first(llc_span)
> > (big.LITTLE typical case), both sd_asym->shared and sd_llc->shared would alias
> > to d->sds[0].
> >
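For illustration, a minimal userspace sketch of the aliasing described
above. It is not kernel code: struct sds_model, span_first() and
pick_shared() are hypothetical stand-ins for struct sched_domain_shared,
cpumask_first() and the sd->shared assignment in the quoted hunk, and
spans are modeled as plain bitmasks.

```c
#include <assert.h>

#define NR_CPUS 8

/* Simplified, hypothetical stand-in for struct sched_domain_shared. */
struct sds_model {
	int ref;
	int nr_idle_scan;
};

/* One slot per CPU, loosely modeling the per-CPU d.sds storage. */
static struct sds_model sds[NR_CPUS];

/* Lowest CPU set in a span bitmask, or -1 if empty (models cpumask_first()). */
static int span_first(unsigned int span)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (span & (1u << cpu))
			return cpu;
	return -1;
}

/* Select the shared slot by the first CPU of the span and take a reference. */
static struct sds_model *pick_shared(unsigned int span)
{
	int id = span_first(span);

	assert(id >= 0);	/* the model never passes an empty span */
	sds[id].ref++;
	return &sds[id];
}
```

With an LLC span of 0x0f and an asym span of 0xff, both calls to
pick_shared() return &sds[0], so a write to nr_idle_scan through one
pointer is visible through the other.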
> >>
> >> Load balancer at different domains will populate "nr_idle_scan" with
> >> different values and they alias to same ->shared if one isn't
> >> degenerated and I believe there is at least one way to hit the WARN_ON()
> >> from cpu_attach_domain() if the SD_ASYM_CPUCAPACITY_FULL comes before
> >> the last SD_SHARE_LLC domain and the latter is degenerated.
> >>
> >> How about this:
> >>
> >> (On top of queue:sched/core; Lightly tested on !ASYM_CPUCAPACITY system)
> >>
> >> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> >> index fe09d3268bc9..1d2c98dca211 100644
> >> --- a/include/linux/sched/topology.h
> >> +++ b/include/linux/sched/topology.h
> >> @@ -67,7 +67,15 @@ struct sched_domain_shared {
> >> atomic_t ref;
> >> atomic_t nr_busy_cpus;
> >> int has_idle_cores;
> >> - int nr_idle_scan;
> >> + union {
> >> + int nr_idle_scan;
> >> + /*
> >> + * Used during allocation to claim the
> >> + * sched_domain_shared object at
> >> + * multiple levels.
> >
> > I think between build and the first LB tick, readers of nr_idle_scan may observe
> > leftover SD_* flags in nr_idle_scan. This shouldn't be a problem and should
> > self-heal soon, but maybe it's worth a comment? Something like:
> >
> > * Note: between build and the first periodic LB tick, which
> > * rewrites the union via update_idle_cpu_scan(), readers of
> > * nr_idle_scan may observe the transient SD_* flag value as
> > * the scan bound. The flag bits are small positive integers,
> > * so the effect is just a slightly relaxed scan bound for one
> > * window and self-heals on the first tick.
>
> Ack! We start with 0 today which isn't representative of the system
> state either and depend on the eventual correctness to fix the value
> after a hotplug / cpuset.
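A minimal userspace sketch of the transient window discussed above,
under stated assumptions: the name of the second union member is not
visible in the quoted hunk, so claim_flags is hypothetical, as are
MODEL_SD_FLAG (an arbitrary small positive value, not a real SD_* bit)
and the helper functions.

```c
#include <assert.h>

/* Hypothetical flag value; stands in for an SD_* bit, not the kernel's. */
#define MODEL_SD_FLAG 0x20

/* Simplified model: the scan bound shares storage with a build-time field. */
struct sds_union_model {
	union {
		int nr_idle_scan;	/* runtime: idle scan bound */
		int claim_flags;	/* build time: hypothetical claim marker */
	};
};

/* Build time: mark the object as claimed by storing the flag value. */
static void model_claim(struct sds_union_model *s, int flags)
{
	assert(flags > 0);	/* the model only uses small positive flags */
	s->claim_flags = flags;
}

/* First LB tick: models update_idle_cpu_scan() rewriting the union. */
static void model_update_idle_cpu_scan(struct sds_union_model *s, int nr)
{
	s->nr_idle_scan = nr;
}

/* Reader side: whatever the union currently holds is used as the bound. */
static int model_read_scan_bound(const struct sds_union_model *s)
{
	return s->nr_idle_scan;
}
```

Between model_claim() and the first model_update_idle_cpu_scan(), a
reader gets the flag value back as the scan bound; after the update it
gets the value the load balancer wrote.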
>
> I can fold in the note and resend it as a formal patch.
>
> Peter, would you prefer a formal patch or would you like to do this
> (or something similar) as a part of the conflict resolution itself?
>
> >> + BUG_ON(!sd->shared);
> >
> > Unreachable in practice, but should we have a WARN_ON_ONCE() +
> > bail/early-return? In this way we'd fall back to using LLC's shared for
> > sd_balance_shared, which seems nicer than a BUG_ON().
>
> Ack! We can just use the last CPU's "sds" if we don't end up finding a
> free one as a backup. I just had the BUG_ON() to easily spot my VM
> crashing ;-)
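A rough userspace sketch of the warn-and-fall-back shape discussed
above, modeling the "use the last CPU's sds as a backup" variant. All
names are hypothetical: struct sds_slot, warn_on_once() (a stand-in for
the kernel's WARN_ON_ONCE()) and claim_free_slot() are not taken from
the patch, and a slot is treated as free when its ref count is zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_SLOTS 4

/* Hypothetical per-CPU slot; free when ref == 0. */
struct sds_slot {
	int ref;
};

static bool warned;

/* Stand-in for WARN_ON_ONCE(): print once, always return the condition. */
static bool warn_on_once(bool cond, const char *msg)
{
	if (cond && !warned) {
		warned = true;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Claim the first free slot; warn and fall back to the last slot if none. */
static struct sds_slot *claim_free_slot(struct sds_slot *slots, int nr)
{
	struct sds_slot *found = NULL;

	assert(nr > 0);
	for (int i = 0; i < nr; i++) {
		if (slots[i].ref == 0) {
			found = &slots[i];
			break;
		}
	}

	/* Unexpected in practice: keep going with a backup instead of crashing. */
	if (warn_on_once(!found, "no free shared slot, using last one"))
		found = &slots[nr - 1];

	found->ref++;
	return found;
}
```

The point of the shape is that the unexpected case still returns a
usable object and leaves a single warning behind, rather than taking
the machine down.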
>
> --
> Thanks and Regards,
> Prateek
>